(57) Abstract

The present invention is directed to the production of an antipathogenic substance (APS) in a host via recombinant expression of
the polypeptides needed to biologically synthesize the APS. Genes encoding polypeptides necessary to produce particular antipathogenic
substances are provided, along with methods for identifying and isolating genes needed to recombinantly biosynthesize any desired APS.
The cloned genes may be transformed and expressed in a desired host organism to produce the APS according to the invention for a
variety of purposes, including protecting the host from a pathogen, developing the host as a biocontrol agent, and producing large uniform
amounts of the APS.

GENES FOR THE SYNTHESIS OF ANTIPATHOGENIC SUBSTANCES 

The present invention relates generally to the protection of host organisms against 
pathogens, and more particularly to the protection of plants against phytopathogens. In
one aspect it provides transgenic plants which have enhanced resistance to
phytopathogens and biocontrol organisms with enhanced biocontrol properties. It further 
provides methods for protecting plants against phytopathogens and methods for the 
production of antipathogenic substances. 

Plants routinely become infected by fungi and bacteria, and many microbial species have 
evolved to utilize the different niches provided by the growing plant. Some phytopathogens
have evolved to infect foliar surfaces and are spread through the air, from plant-to-plant
contact or by various vectors, whereas other phytopathogens are soil-borne and
preferentially infect roots and newly germinated seedlings. In addition to infection by fungi
and bacteria, many plant diseases are caused by nematodes which are soil-borne and
infect roots, typically causing serious damage when the same crop species is cultivated for
successive years on the same area of ground. 

Plant diseases cause considerable crop loss from year to year, resulting both in economic
hardship to farmers and nutritional deprivation for local populations in many parts of the
world. The widespread use of fungicides has provided considerable security against
phytopathogen attack, but despite $1 billion worth of expenditure on fungicides, worldwide
crop losses amounted to approximately 10% of crop value in 1981 (James, Seed Sci. &
Technol. 9: 679-685 (1981)). The severity of the destructive process of disease depends on
the aggressiveness of the phytopathogen and the response of the host, and one aim of
most plant breeding programs is to increase the resistance of host plants to disease. Novel
gene sources and combinations developed for resistance to disease have typically only had
a limited period of successful use in many crop-pathogen systems due to the rapid
evolution of phytopathogens to overcome resistance genes. In addition, there are several
documented cases of the evolution of fungal strains which are resistant to particular
fungicides. As early as 1981, Fletcher and Wolfe (Proc. 1981 Brit. Crop Prot. Conf. (1981))
contended that 24% of the powdery mildew populations from spring barley, and 53% from
winter barley, showed considerable variation in response to the fungicide triadimenol and
that the distribution of these populations varied between barley varieties, with the most
susceptible variety also giving the highest incidence of less susceptible fungal types.
Similar variation in the sensitivity of fungi to fungicides has been documented for wheat
mildew (also to triadimenol), Botrytis (to benomyl), Pyrenophora (to organomercury),
Pseudocercosporella (to MBC-type fungicides) and Mycosphaerella fijiensis (to triazoles), to
mention just a few (Jones and Clifford; Cereal Diseases, John Wiley, 1983). Diseases
caused by nematodes have also been controlled successfully by pesticide application.
Whereas most fungicides are relatively harmless to mammals and the problems with their
use lie in the development of resistance in target fungi, the major problem associated with
the use of nematicides is their relatively high toxicity to mammals. Most nematicides used
to control soil nematodes are of the carbamate, organochlorine or organophosphorus
groups and must be applied to the soil with particular care.

In some crop species, the use of biocontrol organisms has been developed as a further
alternative to protect crops. Biocontrol organisms have the advantage of being able to
colonize and protect parts of the plant inaccessible to conventional fungicides. This
practice developed from the recognition that crops grown in some soils are naturally
resistant to certain fungal phytopathogens and that the suppressive nature of these soils is
lost by autoclaving. Furthermore, it was recognized that soils which are conducive to the
development of certain diseases could be rendered suppressive by the addition of small
quantities of soil from a suppressive field (Scher et al. Phytopathology 70: 412-417 (1980)).
Subsequent research demonstrated that root colonizing bacteria were responsible for this
phenomenon, now known as biological disease control (Baker et al. Biological Control of
Plant Pathogens, Freeman Press, San Francisco, 1974). In many cases, the most efficient
strains of biological disease controlling bacteria are of the species Pseudomonas
fluorescens (Weller et al. Phytopathology 73: 463-469 (1983); Kloepper et al.
Phytopathology 71: 1020-1024 (1981)). Important plant pathogens that have been
effectively controlled by seed inoculation with these bacteria include Gaeumannomyces
graminis, the causative agent of take-all in wheat (Cook et al. Soil Biol. Biochem. 8: 269-273
(1976)) and the Pythium and Rhizoctonia phytopathogens involved in damping off of cotton
(Howell et al. Phytopathology 69: 480-482 (1979)). Several biological disease controlling
Pseudomonas strains produce antibiotics which inhibit the growth of fungal phytopathogens
(Howell et al. Phytopathology 69: 480-482 (1979); Howell et al. Phytopathology 70: 712-715
(1980)) and these have been implicated in the control of fungal phytopathogens in the
rhizosphere. Although biocontrol was initially believed to have considerable promise as a
method of widespread application for disease control, it has found application mainly in the
environment of glasshouse crops where its utility in controlling soil-borne phytopathogens is
best suited for success. Large scale field application of naturally occurring microorganisms
has not proven possible due to constraints of microorganism production (they are often slow
growing), distribution (they are often short lived) and cost (the result of both these
problems). In addition, the success of biocontrol approaches is also largely limited by the
identification of naturally occurring strains which may have a limited spectrum of efficacy.
Some initial approaches have also been taken to control nematode phytopathogens using
biocontrol organisms. Although these approaches are still exploratory, some Streptomyces
species have been reported to control the root knot nematode (Meloidogyne spp.) (WO
93/18135 to Research Corporation Technologies), and toxins from some Bacillus
thuringiensis strains (such as israelensis) have been shown to have broad anti-nematode
activity, and spore or bacillus preparations may thus provide suitable biocontrol opportunities
(EP 0 352 052 to Mycogen, WO 93/19604 to Research Corporation Technologies).

The traditional methods of protecting crops against disease, including plant breeding for 
disease resistance, the continued development of fungicides, and more recently, the 
identification of biocontrol organisms, have all met with success. It is apparent, however,
that scientists must constantly be in search of new methods with which to protect crops
against disease. This invention provides novel methods for the protection of plants against 
phytopathogens. 

The present invention reveals the genetic basis for substances produced by particular 
microorganisms via a multi-gene biosynthetic pathway which have a deleterious effect on 
the multiplication or growth of plant pathogens. These substances include carbohydrate
containing antibiotics such as aminoglycosides, peptide antibiotics, nucleoside derivatives
and other heterocyclic antibiotics containing nitrogen and/or oxygen, polyketides,
macrocyclic lactones, and quinones.

The invention provides the entire set of genes required for recombinant production of
particular antipathogenic substances in a host organism. It further provides methods for the 
manipulation of APS gene sequences for their expression in transgenic plants. The 
transgenic plants thus modified have enhanced resistance to attack by phytopathogens. 
The invention provides methods for the cellular targeting of APS gene products so as to 
ensure that the gene products have appropriate spatial localization for the availability of the 
required substrate/s. Further provided are methods for the enhancement of throughput 
through the APS metabolic pathway by overexpression and overproduction of genes 
encoding substrate precursors. 

The invention further provides a novel method for the identification and isolation of the 
genes involved in the biosynthesis of any particular APS in a host organism. 
The invention also describes improved biocontrol strains which produce heterologous APSs 
and which are efficacious in controlling soil-borne and seedling phytopathogens outside the 
usual range of the host.

Thus, the invention provides methods for disease control. These methods involve the use 
of transgenic plants expressing APS biosynthetic genes and the use of biocontrol agents 
expressing APS genes.

The invention further provides methods for the production of APSs in quantities large 
enough to enable their isolation and use in agricultural formulations. A specific advantage 
of these production methods is the uniform chirality of the molecules produced; production
in transgenic organisms avoids the generation of populations of racemic mixtures, within 
which some enantiomers may have reduced activity. 

DEFINITIONS 

As used in the present application, the following terms have the meanings set out below.

Antipathogenic Substance: A substance which requires one or more nonendogenous
enzymatic activities foreign to a plant to be produced in a host where it does not naturally
occur, which substance has a deleterious effect on the multiplication or growth of a
pathogen. By "nonendogenous enzymatic activities" is meant enzymatic
activities that do not naturally occur in the host where the antipathogenic substance does
not naturally occur. A pathogen may be a fungus, bacteria, nematode, virus, viroid, insect
or combination thereof, and may be the direct or indirect causal agent of disease in the host
organism. An antipathogenic substance can prevent the multiplication or growth of a
phytopathogen or can kill a phytopathogen. An antipathogenic substance may be
synthesized from a substrate which naturally occurs in the host. Alternatively, an
antipathogenic substance may be synthesized from a substrate that is provided to the host
along with the necessary nonendogenous enzymatic activities. An antipathogenic
substance may be a carbohydrate containing antibiotic, a peptide antibiotic, a heterocyclic
antibiotic containing nitrogen, a heterocyclic antibiotic containing oxygen, a heterocyclic
antibiotic containing nitrogen and oxygen, a polyketide, a macrocyclic lactone, or a
quinone. Antipathogenic substance is abbreviated as "APS" throughout the text of this
application.

Anti-phytopathogenic substance: An antipathogenic substance as herein defined which has 
a deleterious effect on the multiplication or growth of a plant pathogen (i.e. phytopathogen).

Biocontrol agent: An organism which is capable of affecting the growth of a pathogen such 
that the ability of the pathogen to cause a disease is reduced. Biocontrol agents for plants 
include microorganisms which are capable of colonizing plants or the rhizosphere. Such
biocontrol agents include gram-negative microorganisms such as Pseudomonas,
Enterobacter and Serratia, the gram-positive microorganism Bacillus and the fungi
Trichoderma and Gliocladium. Organisms may act as biocontrol agents in their native state
or when they are genetically engineered according to the invention. 

Pathogen: Any organism which causes a deleterious effect on a selected host under 
appropriate conditions. Within the scope of this invention the term pathogen is intended to
include fungi, bacteria, nematodes, viruses, viroids and insects. 

Promoter or Regulatory DNA Sequence: An untranslated DNA sequence which assists in,
enhances, or otherwise affects the transcription, translation or expression of an associated
structural DNA sequence which codes for a protein or other DNA product. The promoter
DNA sequence is usually located at the 5' end of a translated DNA sequence, typically
between 20 and 100 nucleotides from the 5' end of the translation start site.

Coding DNA Sequence: A DNA sequence that is translated in an organism to produce a 
protein. 

Operably Linked to/Associated With: Two DNA sequences which are "associated" or 
"operably linked" are related physically or functionally. For example, a promoter or
regulatory DNA sequence is said to be "associated with" a DNA sequence that codes for an 
RNA or a protein if the two sequences are operably linked, or situated such that the 
regulator DNA sequence will affect the expression level of the coding or structural DNA 
sequence. 

Chimeric Construction/Fusion DNA Sequence: A recombinant DNA sequence in which a 
promoter or regulatory DNA sequence is operably linked to, or associated with, a DNA 
sequence that codes for an mRNA or which is expressed as a protein, such that the 
regulator DNA sequence is able to regulate transcription or expression of the associated 
DNA sequence. The regulator DNA sequence of the chimeric construction is not normally
operably linked to the associated DNA sequence as found in nature. The terms
"heterologous" or "non-cognate" are used to indicate a recombinant DNA sequence in which 
the promoter or regulator DNA sequence and the associated DNA sequence are isolated 
from organisms of different species or genera. 
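
The definition of a heterologous chimeric construction can be illustrated with a
minimal sketch. The Python below is an illustration only and forms no part of the
invention: the class names, field names and example organisms are hypothetical, and
source organisms are compared simply as text labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnaElement:
    """A DNA sequence element and the organism it was isolated from (hypothetical model)."""
    name: str
    source_organism: str


@dataclass(frozen=True)
class ChimericConstruct:
    """A regulator DNA sequence operably linked to an associated DNA sequence."""
    regulator: DnaElement
    coding: DnaElement

    def is_heterologous(self) -> bool:
        # "Heterologous" / "non-cognate": the two elements come from
        # organisms of different species or genera.
        return self.regulator.source_organism != self.coding.source_organism


if __name__ == "__main__":
    construct = ChimericConstruct(
        regulator=DnaElement("example promoter", "organism A"),
        coding=DnaElement("example coding sequence", "organism B"),
    )
    print(construct.is_heterologous())
```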



BRIEF DESCRIPTION OF THE FIGURES 

Figure 1: Restriction map of the cosmid clone pCIB169 from Pseudomonas fluorescens 
carrying the pyrrolnitrin biosynthetic gene region. Restriction sites of the
enzymes EcoRI, HindIII, KpnI, NotI, SphI, and XbaI as well as nucleotide
positions in kbp are indicated. 

Figure 2: Functional Map of the Pyrrolnitrin Gene Region of MOCG134 indicating insertion
points of 30 independent Tn5 insertions along the length of pCIB169 for the
identification of the genes for pyrrolnitrin biosynthesis. EcoRI restriction sites are
designated with E, NotI sites with N. The effect of a Tn5 insertion on prn
production is designated with either + or -, wherein + indicates a prn producer
and - a prn non-producer.

Figure 3: Restriction map of the 9.7 kb MOCG134 Prn gene region of clone pCIB169
involved in pyrrolnitrin biosynthesis. EcoRI restriction sites are designated with
E, NotI sites with N, and HindIII sites with H. Nucleotide positions are indicated
in kbp.

Figure 4: Location of various subclones derived from pCIB169 isolated for sequence 
determination purposes. 

Figure 5: Localization of the four open reading frames (ORFs 1-4) responsible for
pyrrolnitrin biosynthesis in strain MOCG134 on the ~6 kb XbaI/NotI fragment of
pCIB169 comprising the Prn gene region.

Figure 6: Location of the fragments deleted in ORFs 1-4 in the pyrrolnitrin gene cluster of
MOCG134. Deleted fragments are indicated as filled boxes.

Figure 7: Restriction map of the cosmid clone p98/1 from Sorangium cellulosum carrying
the soraphen biosynthetic gene region. The top line depicts the restriction map
of p98/1 and shows the position of restriction sites and their distance from the
left edge in kilobases. Restriction sites shown include: B, Bam HI; Bg, Bgl II; E,
Eco RI; H, Hind III; Pv, Pvu I; Sm, Sma I. The boxes below the restriction map
depict the location of the biosynthetic modules. The activity domains within
each module are designated as follows: β-ketoacylsynthase (KS),
Acyltransferase (AT), Ketoreductase (KR), Acyl Carrier Protein (ACP),
Dehydratase (DH), Enoyl reductase (ER), and Thioesterase (TE).

Figure 8: Construction of pCIB132 from pSUP2021.

Figure 9: Restriction endonuclease map of the phenazine biosynthetic gene cluster
contained on a 5.7 kb EcoRI-HindIII fragment. Orientation and approximate
positions of the six open reading frames are presented below the restriction
map. ORF1, which is not entirely present within the 5.7 kb fragment, encodes a
product with significant homology to plant DAHP synthases. ORF2 (0.65 kb),
ORF3 (0.75 kb), and ORF4 (1.15 kb) have domains homologous to
isochorismatase, anthranilate synthase large subunit, and anthranilate synthase
small subunit, respectively. ORF5 (0.7 kb) demonstrates no homology with
database sequences. The ORF6 (0.65 kb) product has end to end homology
with the gene encoding pyridoxine 5'-phosphate oxidase in E. coli.

BRIEF DESCRIPTION OF THE SEQUENCES IN THE SEQUENCE LISTING

SEQ ID NO:1: Sequence of the Pyrrolnitrin Gene Cluster

SEQ ID NO:2: Protein sequence for ORF1 of pyrrolnitrin gene cluster

SEQ ID NO:3: Protein sequence for ORF2 of pyrrolnitrin gene cluster

SEQ ID NO:4: Protein sequence for ORF3 of pyrrolnitrin gene cluster

SEQ ID NO:5: Protein sequence for ORF4 of pyrrolnitrin gene cluster

SEQ ID NO:6: Sequence of the Soraphen Gene Cluster

SEQ ID NO:7: Sequence of a Plant Consensus Translation Initiator (Clontech)

SEQ ID NO:8: Sequence of a Plant Consensus Translation Initiator (Joshi)

SEQ ID NO:9: Sequence of an Oligonucleotide for Use in a Molecular Adaptor

SEQ ID NO:10: Sequence of an Oligonucleotide for Use in a Molecular Adaptor

SEQ ID NO:11: Sequence of an Oligonucleotide for Use in a Molecular Adaptor

SEQ ID NO:12: Sequence of an Oligonucleotide for Use in a Molecular Adaptor

SEQ ID NO:13: Sequence of an Oligonucleotide for Use in a Molecular Adaptor

SEQ ID NO:14: Sequence of an Oligonucleotide for Use in a Molecular Adaptor

SEQ ID NO:15: Oligonucleotide used to change restriction site

SEQ ID NO:16: Oligonucleotide used to change restriction site

SEQ ID NO:17: Sequence of the Phenazine Gene Cluster

SEQ ID NO:18: Protein sequence for phz1 from the phenazine gene cluster

SEQ ID NO:19: Protein sequence for phz2 from the phenazine gene cluster

SEQ ID NO:20: Protein sequence for phz3 from the phenazine gene cluster

SEQ ID NO:21: DNA sequence for phz4 of Phenazine gene cluster

SEQ ID NO:22: Protein sequence for phz4 from the phenazine gene cluster

Production of Antipathogenic Substances by Microorganisms

Many organisms produce secondary metabolites and some of these inhibit the growth of
other organisms. Since the discovery of penicillin, a large number of compounds with
antibiotic activity have been identified, and the number continues to increase with ongoing
screening efforts. Antibiotically active metabolites comprise a broad range of chemical
structures. The most important include: aminoglycosides (e.g. streptomycin) and other
carbohydrate containing antibiotics, peptide antibiotics (e.g. β-lactams, rhizocticin (see
Rapp, C. et al., Liebigs Ann. Chem.: 655-661 (1988))), nucleoside derivatives (e.g.
blasticidin S) and other heterocyclic antibiotics containing nitrogen (e.g. phenazine and
pyrrolnitrin) and/or oxygen, polyketides (e.g. soraphen), macrocyclic lactones (e.g.
erythromycin) and quinones (e.g. tetracycline).

Aminoglycosides and Other Carbohydrate Containing Antibiotics

The aminoglycosides are oligosaccharides consisting of an aminocyclohexanol moiety
glycosidically linked to other amino sugars. Streptomycin, one of the best studied of the
group, is produced by Streptomyces griseus. The biochemistry and biosynthesis of this
compound are complex (for review see Mansouri et al. in: Genetics and Molecular Biology of
Industrial Microorganisms (ed.: Hershberger et al.), American Society for Microbiology,
Washington, D. C. pp 61-67 (1989)) and involve 25 to 30 genes, 19 of which have been
analyzed so far (Retzlaff et al. in: Industrial Microorganisms: Basic and Applied Molecular
Genetics (ed.: Baltz et al.), American Society for Microbiology, Washington, D. C. pp 183-194
(1993)). Streptomycin, like many other aminoglycosides, inhibits protein synthesis in
the target organisms.

Peptide Antibiotics 

Peptide antibiotics are classifiable into two groups: (1) those which are synthesized by 
enzyme systems without the participation of the ribosomal apparatus, and (2) those which
require the ribosomally-mediated translation of an mRNA to provide the precursor of the
antibiotic. 

Non-Ribosomal Peptide Antibiotics are assembled by large, multifunctional enzymes
which activate, modify, polymerize and in some cases cyclize the subunit amino acids,
forming polypeptide chains. Other acids, such as aminoadipic acid, diaminobutyric acid,
diaminopropionic acid, dihydroxyamino acid, isoserine, dihydroxybenzoic acid,
hydroxyisovaleric acid, (4R)-4-[(E)-2-butenyl]-4,N-dimethyl-L-threonine, and ornithine are
also incorporated (Katz & Demain, Bacteriological Reviews 41: 449-474 (1977); Kleinkauf &
von Dohren, Annual Review of Microbiology 41: 259-289 (1987)). The products are not
encoded by any mRNA, and ribosomes do not directly participate in their synthesis.
Peptide antibiotics synthesized non-ribosomally can in turn be grouped according to their
general structures into linear, cyclic, lactone, branched cyclopeptide, and depsipeptide
categories (Kleinkauf & von Dohren, European Journal of Biochemistry 192: 1-15 (1990)).
These different groups of antibiotics are produced by the action of modifying and cyclizing
enzymes; the basic scheme of polymerization is common to them all. Non-ribosomally
synthesized peptide antibiotics are produced by both bacteria and fungi, and include
edeine, linear gramicidin, tyrocidine and gramicidin S from Bacillus brevis, mycobacillin from
Bacillus subtilis, polymyxin from Bacillus polymyxa, etamycin from Streptomyces griseus,
echinomycin from Streptomyces echinatus, actinomycin from Streptomyces clavuligerus,
enterochelin from Escherichia coli, gamma-(alpha-L-aminoadipyl)-L-cysteinyl-D-valine (ACV)
from Aspergillus nidulans, alamethicin from Trichoderma viride, destruxin from Metarhizium
anisopliae, enniatin from Fusarium oxysporum, and beauvericin from Beauveria bassiana.
Extensive functional and structural similarity exists between the prokaryotic and eukaryotic
systems, suggesting a common origin for both. The activities of peptide antibiotics are
similarly broad; toxic effects of different peptide antibiotics in animals, plants, bacteria, and
fungi are known (Hansen, Annual Review of Microbiology 47: 535-564 (1993); Katz &
Demain, Bacteriological Reviews 41: 449-474 (1977); Kleinkauf & von Dohren, Annual
Review of Microbiology 41: 259-289 (1987); Kleinkauf & von Dohren, European Journal of
Biochemistry 192: 1-15 (1990); Kolter & Moreno, Annual Review of Microbiology 46: 141-163
(1992)).

Ribosomally-Synthesized Peptide Antibiotics are characterized by the existence of a 
structural gene for the antibiotic itself, which encodes a precursor that is modified by 
specific enzymes to create the mature molecule. The use of the general protein synthesis 
apparatus for peptide antibiotic synthesis opens up the possibility for much longer polymers 
to be made, although these peptide antibiotics are not necessarily very large. In addition to 
a structural gene, further genes are required for extracellular secretion and immunity, and
these genes are believed to be located close to the structural gene, in most cases probably
on the same operon. Two major groups of peptide antibiotics made on ribosomes exist:
those which contain the unusual amino acid lanthionine, and those which do not.
Lanthionine-containing antibiotics (lantibiotics) are produced by gram-positive bacteria,
including species of Lactococcus, Staphylococcus, Streptococcus, Bacillus, and
Streptomyces. Linear lantibiotics (for example, nisin, subtilin, epidermin, and gallidermin),
and circular lantibiotics (for example, duramycin and cinnamycin), are known (Hansen,
Annual Review of Microbiology 47: 535-564 (1993); Kolter & Moreno, Annual Review of
Microbiology 46: 141-163 (1992)). Lantibiotics often contain other characteristic modified
residues such as dehydroalanine (DHA) and dehydrobutyrine (DHB), which are derived
from the dehydration of serine and threonine, respectively. The reaction of a thiol from
cysteine with DHA yields lanthionine, and with DHB yields β-methyllanthionine. Peptide
antibiotics which do not contain lanthionine may contain other modifications, or they may
consist only of the ordinary amino acids used in protein synthesis. Non-lanthionine-
containing peptide antibiotics are produced by both gram-positive and gram-negative
bacteria, including Lactobacillus, Lactococcus, Pediococcus, Enterococcus, and
Escherichia. Antibiotics in this category include lactacins, lactocins, sakacin A, pediocins,
diplococcin, lactococcins, and microcins (Hansen, supra; Kolter & Moreno, supra).
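
The residue conversions described for the lantibiotics can be summarized in a short
sketch. The Python below is illustrative only and is not part of the invention: the
function name thioether_product and the table layout are assumptions introduced
here; only the chemistry (serine to DHA, threonine to DHB, and cysteine thiol
addition to each) is taken from the text.

```python
# Illustrative only: names are hypothetical, the conversions are those named in the text.

# Dehydration of the hydroxy amino acids gives the dehydro residues.
DEHYDRATION = {
    "Ser": "DHA",  # serine -> dehydroalanine
    "Thr": "DHB",  # threonine -> dehydrobutyrine
}

# Addition of a cysteine thiol to a dehydro residue gives the thioether amino acid.
THIOL_ADDITION = {
    "DHA": "lanthionine",
    "DHB": "beta-methyllanthionine",
}


def thioether_product(residue: str) -> str:
    """Return the thioether amino acid formed when `residue` (Ser or Thr)
    is dehydrated and the product then reacts with a cysteine thiol."""
    try:
        dehydro = DEHYDRATION[residue]
    except KeyError:
        raise ValueError(f"{residue!r} is not dehydrated in this scheme") from None
    return THIOL_ADDITION[dehydro]


if __name__ == "__main__":
    for res in ("Ser", "Thr"):
        print(res, "->", DEHYDRATION[res], "->", thioether_product(res))
```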

Nucleoside Derivatives and Other Heterocyclic Antibiotics Containing Nitrogen and/or
Oxygen 

These compounds all contain heterocyclic rings but are otherwise structurally diverse and,
as illustrated in the following examples, have very different biological activities.

Polyoxins and Nikkomycins are nucleoside derivatives and structurally resemble UDP-N-
acetylglucosamine, the substrate of chitin synthase. They have been identified as
competitive inhibitors of chitin synthase (Gooday, in: Biochemistry of Cell Walls and
Membranes in Fungi (ed.: Kuhn et al.), Springer-Verlag, Berlin p. 61 (1990)). The polyoxins
are produced by Streptomyces cacaoi and the nikkomycins are produced by S. tendae.

Phenazines are nitrogen-containing heterocyclic compounds with a common planar 
aromatic tricyclic structure. Over 50 naturally occurring phenazines have been identified, 
each differing in the substituent groups on the basic ring structure. This group of 
compounds is produced in nature exclusively by bacteria, in particular
Streptomyces, Sorangium, and Pseudomonas (for review see Turner & Messenger,
Advances in Microbial Physiology 27: 211-275 (1986)). Recently, the phenazine
biosynthetic genes of a P. aureofaciens strain have been isolated (Pierson & Thomashow,
MPMI 5: 330-339 (1992)). Because of their planar aromatic structure, it has been proposed
that phenazines may form intercalative complexes with DNA (Hollstein & van Gemert,
Biochemistry 10: 497 (1971)), and thereby interfere with DNA metabolism. The phenazine
myxin was shown to intercalate DNA (Hollstein & Butler, Biochemistry 11: 1345 (1972)) and
the phenazine lomofungin was shown to inhibit RNA synthesis in yeast (Cannon & Jiminez,
Biochemical Journal 142: 457 (1974); Ruet et al., Biochemistry 14: 4651 (1975)).

Pyrrolnitrin is a phenylpyrrole derivative with strong antibiotic activity and has been shown
to inhibit a broad range of fungi (Homma et al., Soil Biol. Biochem. 21: 723-728 (1989);
Nishida et al., J. Antibiot., ser. A, 18: 211-219 (1965)). It was originally isolated from
Pseudomonas pyrrocinia (Arima et al., J. Antibiot., ser. A, 18: 201-204 (1965)), and has
since been isolated from several other Pseudomonas species and Myxococcus species
(Gerth et al., J. Antibiot. 35: 1101-1103 (1982)). The compound has been reported to inhibit
fungal respiratory electron transport (Tripathi & Gottlieb, J. Bacteriol. 100: 310-318 (1969))
and uncouple oxidative phosphorylation (Lambowitz & Slayman, J. Bacteriol. 112: 1020-1022
(1972)). It has also been proposed that pyrrolnitrin causes generalized lipoprotein
membrane damage (Nose & Arima, J. Antibiot., ser. A, 22: 135-143 (1969); Carlone &
Scannerini, Mycopathologia et Mycologia Applicata 53: 111-123 (1974)). Pyrrolnitrin is
biosynthesized from tryptophan (Chang et al., J. Antibiot. 34: 555-566) and the biosynthetic
genes from P. fluorescens have now been cloned (see Section C of examples). Thus, one
embodiment of the present invention relates to an isolated DNA molecule encoding one or
more polypeptides for the biosynthesis of pyrrolnitrin in a heterologous host, which molecule
can be used to genetically engineer a host organism to express said antipathogenic
substance. Other embodiments of the invention are the isolated polypeptides required for
the biosynthesis of pyrrolnitrin.

Polyketide Synthases

Many antibiotics, in spite of the apparent structural diversity, share a common pattern of
biosynthesis. The molecules are built up from two carbon building blocks, the β-carbon of
which always carries a keto group, thus the name polyketide. The tremendous structural
diversity derives from the different lengths of the polyketide chain and the different side-
chains introduced, either as part of the two carbon building blocks, or after the polyketide
backbone is formed. The keto groups may also be reduced to hydroxyls or removed
altogether. Each round of two carbon addition is carried out by a complex of enzymes
called the polyketide synthases (PKS) in a manner similar to fatty acid biosynthesis. The
biosynthetic genes for an increasing number of polyketide antibiotics have been isolated
and sequenced. It is quite apparent that the PKS genes are structurally conserved. The
encoded proteins generally fall into two types: type I proteins are polyfunctional, with
several catalytic domains carrying out different enzymatic steps covalently linked together
(e.g. PKS for erythromycin, soraphen, and avermectin (Jaoua et al. Plasmid 28: 157-165
(1992); MacNeil et al. in: Industrial Microorganisms: Basic and Applied Molecular Genetics,
(ed.: Baltz et al.), American Society for Microbiology, Washington D. C. pp. 245-256
(1993))); whereas type II proteins are monofunctional (Hutchinson et al. in: Industrial
Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society
for Microbiology, Washington D. C. pp. 203-216 (1993)). For the simpler polyketide
antibiotics such as actinorhodin (produced by Streptomyces coelicolor), the several rounds
of two carbon additions are carried out iteratively on PKS enzymes encoded by one set of
PKS genes. In contrast, synthesis of the more complicated compounds such as
erythromycin and soraphen (see Section E of examples) involves sets of PKS genes
organized into modules, with each module carrying out one round of two carbon addition
(for review see Hopwood et al. in: Industrial Microorganisms: Basic and Applied Molecular
Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D. C. pp. 267-
275 (1993)). The present invention provides the biosynthetic genes of soraphen from
Sorangium (see Section E of examples). Thus, another embodiment of the present
invention relates to an isolated DNA molecule encoding one or more polypeptides for the
biosynthesis of soraphen in a heterologous host, which molecule can be used to genetically
engineer a host organism to express said antipathogenic substance. Other embodiments of
the invention are isolated polypeptides required for the biosynthesis of soraphen.

Macrocyclic Lactones

This group of compounds shares the presence of a large lactone ring with various ring 
substituents. They can be further classified into subgroups, depending on the ring size and 
other characteristics. The macrolides, for example, contain 12-, 14-, 16-, or 17-membered
lactone rings glycosidically linked to one or more aminosugars and/or deoxysugars. They
are inhibitors of protein synthesis, and are particularly effective against gram-positive
bacteria. Erythromycin A, a well-studied macrolide produced by Saccharopolyspora
erythraea, consists of a 14-membered lactone ring linked to two deoxy sugars. Many of the
biosynthetic genes have been cloned; all have been located within a 60 kb segment of the
S. erythraea chromosome. At least 22 closely linked open reading frames have been
identified as likely to be involved in erythromycin biosynthesis (Donadio et al. in: Industrial
Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society
for Microbiology, Washington D.C. pp. 257-265 (1993)).

Quinones 

Quinones are aromatic compounds with two carbonyl groups on a fully unsaturated ring. 
The compounds can be broadly classified into subgroups according to the number of
aromatic rings present, i.e., benzoquinones, naphthoquinones, etc. A well studied group is
the tetracyclines, which contain a naphthacene ring with different substituents. Tetracyclines
are protein synthesis inhibitors and are effective against both gram-positive and gram-
negative bacteria, as well as rickettsias, mycoplasma, and spirochetes. The aromatic rings
in the tetracyclines are derived from polyketide molecules. Genes involved in the
biosynthesis of oxytetracycline (produced by Streptomyces rimosus) have been cloned and
expressed in Streptomyces lividans (Binnie et al. J. Bacteriol. 171: 887-895 (1989)). The
PKS genes share homology with those for actinorhodin and therefore encode type II
(monofunctional) PKS proteins (Hopwood & Sherman, Ann. Rev. Genet. 24: 37-66
(1990)). 

Other Types of APS

Several other types of APSs have been identified. One of these is the antibiotic 2-hexyl-5-
propyl-resorcinol which is produced by certain strains of Pseudomonas. It was first isolated
from the Pseudomonas strain B-9004 (Kanda et al. J. Antibiot. 28: 935-942 (1975)) and is a
dialkyl-substituted derivative of 1,3-dihydroxybenzene. It has been shown to have
antipathogenic activity against Gram-positive bacteria (in particular Clavibacter sp.),
mycobacteria, and fungi. 

Another type of APS comprises the methoxyacrylates, such as strobilurin B. Strobilurin B is
produced by Basidiomycetes and has a broad spectrum of fungicidal activity (Anke, T. et
al., Journal of Antibiotics (Tokyo) 30: 806-810 (1977)). In particular, strobilurin B is produced
by the fungus Bolinia lutea. Strobilurin B appears to have antifungal activity as a result of
its ability to inhibit cytochrome b dependent electron transport, thereby inhibiting respiration
(Becker, W. et al., FEBS Letters 132: 329-333 (1981)).

Most antibiotics have been isolated from bacteria, actinomycetes, and fungi. Their role in
the biology of the host organism is often unknown, but many have been used with great 
success, both in medicine and agriculture, for the control of microbial pathogens. 
Antibiotics which have been used in agriculture are: blasticidin S and kasugamycin for the 
control of rice blast (Pyricularia oryzae), validamycin for the control of Rhizoctonia solani,
prumycin for the control of Botrytis and Sclerotinia species, and mildiomycin for the control
of mildew. 

To date, the use of antibiotics in plant protection has involved the production of the 
compounds through chemical synthesis or fermentation and application to seeds, plant 
parts, or soil. This invention describes the identification and isolation of the biosynthetic
genes of a number of anti-phytopathogenic substances and further describes the use of
these genes to create transgenic plants with enhanced disease resistance characteristics
and also the creation of improved biocontrol strains by expression of the isolated genes in
organisms which colonize host plants or the rhizosphere. Furthermore, the availability of
such genes provides methods for the production of APSs for isolation and application in 
antipathogenic formulations. 

Methods for Cloning Genes for Antipathogenic Substances

Antibiotic biosynthetic genes can be cloned using a variety of techniques
according to the invention. The simplest procedure for the cloning of APS genes requires
the cloning of genomic DNA from an organism identified as producing an APS, and the
transfer of the cloned DNA on a suitable plasmid or vector to a host organism which does
not produce the APS, followed by the identification of transformed host colonies to which
the APS-producing ability has been conferred. Using a technique such as λ::Tn5
transposon mutagenesis (de Bruijn & Lupski, Gene 27: 131-149 (1984)), the exact region of
the transforming APS-conferring DNA can be more precisely defined. Alternatively or
additionally, the transforming APS-conferring DNA can be cleaved into smaller fragments
and the smallest which maintains the APS-conferring ability further characterized. Whereas
the host organism lacking the ability to produce the APS may be a different species from the
organism from which the APS derives, a variation of this technique involves the
transformation of host DNA into the same host which has had its APS-producing ability
disrupted by mutagenesis. In this method, an APS-producing organism is mutated and
non-APS-producing mutants are isolated, and these are complemented by cloned genomic DNA
from the APS-producing parent strain. A further example of a standard technique used to
clone genes required for APS biosynthesis is the use of transposon mutagenesis to
generate mutants of an APS-producing organism which, after mutagenesis, fail to produce
the APS. Thus, the region of the host genome responsible for APS production is tagged by
the transposon and can be easily recovered and used as a probe to isolate the native
genes from the parent strain. Biosynthetic genes for APSs which are similar to known
APS compounds may be clonable by virtue of their
sequence homology to the biosynthetic genes of the known compounds. Techniques
suitable for cloning by homology include standard library screening by DNA hybridization.

This invention also describes a novel technique for the isolation of APS biosynthetic genes
which may be used to clone the genes for any APS, and is particularly useful for the cloning
of APS biosynthetic genes which may be recalcitrant to cloning using any of the above
techniques. One reason why such recalcitrance to cloning may exist is that the standard
techniques described above (except for cloning by homology) may preferentially lead to the
isolation of regulators of APS biosynthesis. Once such a regulator has been identified,
however, it can be used in this novel method to isolate the biosynthetic genes under the
control of the cloned regulator. In this method, a library of transposon insertion mutants is
created in a strain of microorganism which lacks the regulator or has had the regulator gene
disabled by conventional gene disruption techniques. The insertion transposon used
carries a promoter-less reporter gene (e.g. lacZ). Once the insertion library has been made,
a functional copy of the regulator gene is transferred to the library of cells (e.g. by
conjugation or electroporation) and the plated cells are selected for expression of the
reporter gene. Cells are assayed before and after transfer of the regulator gene. Colonies
which express the reporter gene only in the presence of the regulator gene carry insertions
adjacent to the promoter of genes regulated by the regulator. Assuming the regulator is
specific in its regulation for APS-biosynthetic genes, then the genes tagged by this
procedure will be APS-biosynthetic genes. In a preferred embodiment the cloned regulator
gene is the gafA gene described in PCT application WO 94/01561 which regulates the
expression of the biosynthetic genes for pyrrolnitrin. Thus, this method is a preferred
method for the cloning of the biosynthetic genes for pyrrolnitrin.
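
The selection step of this reporter screen reduces to comparing each insertion mutant before and after the regulator gene is introduced. The following is a minimal sketch in Python, assuming reporter activity has already been scored as present or absent for each colony. The `Colony` record, its field names, and the example identifiers are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Colony:
    """One transposon insertion mutant scored in the screen (hypothetical record)."""
    insertion_id: str
    reporter_without_regulator: bool  # reporter expressed before regulator transfer
    reporter_with_regulator: bool     # reporter expressed after regulator transfer


def regulator_dependent_insertions(colonies):
    """Return IDs of insertions whose reporter is expressed only with the regulator."""
    return [
        c.insertion_id
        for c in colonies
        if c.reporter_with_regulator and not c.reporter_without_regulator
    ]


# Illustrative data only; these are not results from the specification.
library = [
    Colony("ins-001", False, False),  # no active promoter upstream of the reporter
    Colony("ins-002", True, True),    # promoter active independently of the regulator
    Colony("ins-003", False, True),   # candidate regulator-controlled gene
]
candidates = regulator_dependent_insertions(library)
```

Only insertions of the third kind are retained. Whether they lie in APS biosynthetic genes still depends on the regulator being specific for the APS pathway, as stated above.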

An alternative method for identifying and isolating a gene from a microorganism required for 
the biosynthesis of an antipathogenic substance (APS), wherein the expression of said 
gene is under the control of a regulator of the biosynthesis of said APS, comprises 

(a) cloning a library of genetic fragments from said microorganism into a vector adjacent to
a promoterless reporter gene such that expression of said reporter gene can
occur only if promoter function is provided by the cloned fragment;

(b) transforming the vectors generated from step (a) into a suitable host; 

(c) identifying those transformants from step (b) which express said reporter gene only in 
the presence of said regulator; and 

(d) identifying and isolating the DNA fragment operably linked to the genetic fragment from
said microorganism present in the transformants identified in step (c);

wherein the DNA fragment isolated and identified in step (d) encodes one or more
polypeptides required for the biosynthesis of said APS. 

In order for the cloned APS genes to be of use in transgenic expression, it is important that
all the genes required for synthesis from a particular metabolite be identified and cloned.
Using combinations of, or all of, the techniques described above, this is possible for any known
APS. As most APS biosynthetic genes are clustered together in microorganisms, usually
encoded by a single operon, the identification of all the genes will be possible from the
identification of a single locus in an APS-producing microorganism. In addition, as
regulators of APS biosynthetic genes are believed to regulate the whole pathway, the
cloning of the biosynthetic genes via their regulators is a particularly attractive method of
cloning these genes. In many cases the regulator will control transcription of the single 
entire operon, thus facilitating the cloning of genes using this strategy. 

Using the methods described in this application, biosynthetic genes for any APS can be
cloned from a microorganism. Expression vectors comprising isolated DNA molecules
encoding one or more polypeptides for the biosynthesis of an antipathogenic substance
such as pyrrolnitrin and soraphen can be used to transform a heterologous host. Suitable
heterologous hosts are bacteria, fungi, yeast and plants. In a preferred embodiment of the
invention the transformed hosts will be able to synthesize an antipathogenic substance not
naturally occurring in said host. The host can then be grown under conditions which allow
production of said antipathogenic substance, which can thus be collected from the host.
Using the methods of gene manipulation and transgenic plant production described in this
specification, the cloned APS biosynthetic genes can be modified and expressed in
transgenic plants. Suitable APS biosynthetic genes include those for the APS classes described at the
beginning of this section, viz. aminoglycosides and other carbohydrate containing antibiotics
(e.g. streptomycin), peptide antibiotics (both non-ribosomally and ribosomally synthesized
types), nucleoside derivatives and other heterocyclic antibiotics containing nitrogen and/or
oxygen (e.g. polyoxins, nikkomycins, phenazines, and pyrrolnitrin), polyketides, macrocyclic
lactones and quinones (e.g. soraphen, erythromycin and tetracycline). Expression in
transgenic plants will be under the control of an appropriate promoter and involves
appropriate cellular targeting considering the likely precursors required for the particular
APS under consideration. Whereas the invention is intended to include the expression in
transgenic plants of any APS gene isolatable by the procedures described in this
specification, those which are particularly preferred include the genes for pyrrolnitrin, soraphen,
phenazine, and the peptide antibiotics gramicidin and epidermin. The cloned biosynthetic
genes can also be expressed in soil-borne or plant colonizing organisms for the purpose of
conferring and enhancing biocontrol efficacy in these organisms. Particularly preferred APS
genes for this purpose are those which encode pyrrolnitrin, soraphen, phenazine, and the
peptide antibiotics.

Production of Antipathogenic Substances in Heterologous Microbial Hosts 

Cloned APS genes can be expressed in heterologous bacterial or fungal hosts to enable 
the production of the APS with greater efficiency than might be possible from native hosts.
Techniques for these genetic manipulations are specific for the different available hosts and
are known in the art. For example, the expression vectors pKK223-3 and pKK223-2 can be
used to express heterologous genes in E. coli, either in transcriptional or translational
fusion, behind the tac or trc promoter. For the expression of operons encoding multiple
ORFs, the simplest procedure is to insert the operon into a vector such as pKK223-3 in
transcriptional fusion, allowing the cognate ribosome binding site of the heterologous genes
to be used. Techniques for overexpression in gram-positive species such as Bacillus are
also known in the art and can be used in the context of this invention (Quax et al. in:
Industrial Microorganisms: Basic and Applied Molecular Genetics, Eds. Baltz et al.,
American Society for Microbiology, Washington (1993)). Alternate systems for
overexpression rely on yeast vectors and include the use of Pichia, Saccharomyces and
Kluyveromyces (Sreekrishna, in: Industrial Microorganisms: Basic and Applied Molecular
Genetics, Baltz, Hegeman, and Skatrud eds., American Society for Microbiology,
Washington (1993); Dequin & Barre, Biotechnology 12: 173-177 (1994); van den Berg et al.
Biotechnology 8:135-139 (1990)). 

Cloned APS genes can also be expressed in heterologous bacterial and fungal hosts with 
the aim of increasing the efficacy of biocontrol strains of such bacterial and fungal hosts. 
Thus, a method for protecting plants against phytopathogens is to treat said plant with a 
biocontrol agent transformed with one or more vectors collectively capable of expressing all 
of the polypeptides necessary to produce an anti-pathogenic substance in amounts which 
inhibit said phytopathogen. Microorganisms which are suitable for the heterologous
overexpression of APS genes are all microorganisms which are capable of colonizing plants
or the rhizosphere. As such they will be brought into contact with phytopathogenic fungi,
bacteria and nematodes, causing an inhibition of their growth. These include gram-negative
microorganisms such as Pseudomonas, Enterobacter and Serratia, the gram-positive
microorganism Bacillus and the fungi Trichoderma and Gliocladium. Particularly preferred
heterologous hosts are Pseudomonas fluorescens, Pseudomonas putida, Pseudomonas
cepacia, Pseudomonas aureofaciens, Pseudomonas aurantiaca, Enterobacter cloacae,
Serratia marcescens, Bacillus subtilis, Bacillus cereus, Trichoderma viride, Trichoderma
harzianum and Gliocladium virens. In preferred embodiments of the invention the
biosynthetic genes for pyrrolnitrin, soraphen, phenazine, and/or peptide antibiotics are
transferred to the particularly preferred heterologous hosts listed above. In a particularly
preferred embodiment the biosynthetic genes for phenazine and/or soraphen are
transferred to and expressed in Pseudomonas fluorescens strain CGA267356 (described in
the published application EP 0 472 494) which has biocontrol utility due to its production of
pyrrolnitrin (but not phenazine). In another preferred embodiment, the biosynthetic genes
for pyrrolnitrin and/or soraphen are transferred to Pseudomonas aureofaciens strain 30-84
which has biocontrol characteristics due to its production of phenazine. Expression in
heterologous biocontrol strains requires the selection of vectors appropriate for replication in
the chosen host and a suitable choice of promoter. Techniques are well known in the art for
expression in gram-negative and gram-positive bacteria and fungi and are described 
elsewhere in this specification. 

Expression of Genes for Anti-phytopathogenic Substances in Plants
A method for protecting plants against phytopathogens is to transform said plant with one or 
more vectors collectively capable of expressing all of the polypeptides necessary to produce 
an anti-pathogenic substance in said plant in amounts which inhibit said phytopathogen.
The APS biosynthetic genes of this invention when expressed in transgenic plants cause 
the biosynthesis of the selected APS in the transgenic plants. In this way transgenic plants 
with enhanced resistance to phytopathogenic fungi, bacteria and nematodes are generated. 
For their expression in transgenic plants, the APS genes and adjacent sequences may 
require modification and optimization. 

Although in many cases genes from microbial organisms can be expressed in plants at high 
levels without modification, low expression in transgenic plants may result from APS genes 
having codons which are not preferred in plants. It is known in the art that all organisms 
have specific preferences for codon usage, and the APS gene codons can be changed to 
conform with plant preferences, while maintaining the amino acids encoded. Furthermore,
high expression in plants is best achieved from coding sequences which have at least 35%
GC content, and preferably more than 45%. Microbial genes which have low GC contents
may express poorly in plants due to the existence of ATTTA motifs which may destabilize
messages, and AATAAA motifs which may cause inappropriate polyadenylation. In
addition, potential APS biosynthetic genes can be screened for the existence of illegitimate
splice sites which may cause message truncation. All changes required to be made within
the APS coding sequence such as those described above can be made using well known
techniques of site directed mutagenesis, PCR, and synthetic gene construction using the
methods described in the published patent applications EP 0 385 962 (to Monsanto), EP 0
359 472 (to Lubrizol), and WO 93/07278 (to Ciba-Geigy). The preferred APS biosynthetic
genes may be unmodified genes, should these be expressed at high levels in target
transgenic plant species, or alternatively may be genes modified by the removal of
destabilization and inappropriate polyadenylation motifs and illegitimate splice sites, and
further modified by the incorporation of plant preferred codons, and further with a GC
content preferred for expression in plants. Although preferred gene sequences may be
adequately expressed in both monocotyledonous and dicotyledonous plant species,
sequences can be modified to account for the specific codon preferences and GC content
preferences of monocotyledons or dicotyledons as these preferences have been shown to
differ (Murray et al. Nucl. Acids Res. 17: 477-498 (1989)).
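
The sequence checks named above (GC content against the 35% and 45% levels, and the presence of ATTTA and AATAAA motifs) are simple to automate before any redesign is attempted. The following Python sketch is illustrative only: the function names, the rating labels, and the example sequence are assumptions introduced here, and the sketch does not attempt codon replacement or splice-site prediction.

```python
DESTABILIZING_MOTIF = "ATTTA"     # may destabilize messages (from the text above)
POLYADENYLATION_MOTIF = "AATAAA"  # may cause inappropriate polyadenylation


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA coding sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def motif_positions(seq: str, motif: str) -> list:
    """0-based start positions of every occurrence of motif, overlaps included."""
    seq = seq.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def screen_coding_sequence(seq: str) -> dict:
    """Summarize the features the text flags as relevant to plant expression."""
    gc = gc_percent(seq)
    if gc > 45.0:
        rating = "preferred"    # more than 45% GC
    elif gc >= 35.0:
        rating = "acceptable"   # at least 35% GC
    else:
        rating = "low"          # below 35% GC
    return {
        "gc_percent": gc,
        "gc_rating": rating,
        "ATTTA": motif_positions(seq, DESTABILIZING_MOTIF),
        "AATAAA": motif_positions(seq, POLYADENYLATION_MOTIF),
    }


# Hypothetical AT-rich fragment, used only to exercise the functions.
report = screen_coding_sequence("ATGAATAAAGCATTTATTTAGC")
```

For this made-up fragment the report gives a GC rating of "low", one AATAAA motif at position 3, and two overlapping ATTTA motifs at positions 11 and 15, so such a sequence would be a candidate for modification.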

For efficient initiation of translation, sequences adjacent to the initiating methionine may 
require modification. The sequences cognate to the selected APS genes may initiate 
translation efficiently in plants, or alternatively may do so inefficiently. In the case that they 
do so inefficiently, they can be modified by the inclusion of sequences known to be effective 
in plants. Joshi has suggested an appropriate consensus for plants (NAR 15: 6643-6653
(1987); SEQ ID NO:8) and Clontech suggests a further consensus translation initiator
(1993/1994 catalog, page 210; SEQ ID NO:7). These consensuses are suitable for use
with the APS biosynthetic genes of this invention. The sequences are incorporated into the
APS gene construction, up to and including the ATG (whilst leaving the second amino acid
of the APS gene unmodified), or alternatively up to and including the GTC subsequent to
the ATG (with the possibility of modifying the second amino acid of the transgene).

Expression of APS genes in transgenic plants is behind a promoter shown to be functional 
in plants. The choice of promoter will vary depending on the temporal and spatial 
requirements for expression, and also depending on the target species. For the protection 
of plants against foliar pathogens, expression in leaves is preferred; for the protection of
plants against ear pathogens, expression in inflorescences (e.g. spikes, panicles, cobs etc.)
is preferred; for protection of plants against root pathogens, expression in roots is preferred;
for protection of seedlings against soil-borne pathogens, expression in roots and/or
seedlings is preferred. In many cases, however, protection against more than one type of
phytopathogen will be sought, and thus expression in multiple tissues will be desirable.
Although many promoters from dicotyledons have been shown to be operational in
monocotyledons and vice versa, ideally dicotyledonous promoters are selected for
expression in dicotyledons, and monocotyledonous promoters for expression in
monocotyledons. However, there is no restriction to the provenance of selected promoters;
it is sufficient that they are operational in driving the expression of the APS biosynthetic
genes. In some cases, expression of APSs in plants may provide protection against insect
pests. Transgenic expression of the biosynthetic genes for the APS beauvericin (isolated
from Beauveria bassiana) may, for example, provide protection against insect pests of crop
plants.
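
The tissue preferences listed above amount to a lookup from pathogen class to site of expression, with the union taken when protection against several classes is sought. A minimal Python sketch follows; the dictionary keys, tissue labels, and function name are illustrative labels chosen here, not terms defined by the specification.

```python
# Preferred expression sites by pathogen class, as listed in the paragraph above.
# "roots and/or seedlings" is represented here as both tissues.
EXPRESSION_SITES = {
    "foliar": {"leaves"},
    "ear": {"inflorescences"},
    "root": {"roots"},
    "soil-borne (seedling)": {"roots", "seedlings"},
}


def tissues_for(pathogen_classes):
    """Union of preferred expression sites for the given pathogen classes."""
    tissues = set()
    for pathogen_class in pathogen_classes:
        tissues |= EXPRESSION_SITES[pathogen_class]
    return tissues


# Protection against both foliar and root pathogens calls for two tissues.
combined = tissues_for(["foliar", "root"])
```

A promoter, or combination of promoters, would then be chosen to cover every tissue in the resulting set.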

Preferred promoters which are expressed constitutively include the CaMV 35S and 19S 
promoters, and promoters from genes encoding actin or ubiquitin. Further preferred
constitutive promoters are those from the T2(4-28), CP21, CP24, CP38p and CP29 genes
whose cDNAs are provided by this invention. 

The APS genes of this invention can also be expressed under the regulation of promoters 
which are chemically regulated. This enables the APS to be synthesized only when the 
crop plants are treated with the inducing chemicals, and APS biosynthesis subsequently 
declines. Preferred technology for chemical induction of gene expression is detailed in the
published European patent application EP 0 332 104 (to Ciba-Geigy) herein incorporated by
reference. A preferred promoter for chemical induction is the tobacco PR-1a promoter.

A preferred category of promoters is that which is wound inducible. Numerous promoters
have been described which are expressed at wound sites and also at the sites of
phytopathogen infection. These are suitable for the expression of APS genes because
APS biosynthesis is turned on by phytopathogen infection and thus the APS only
accumulates when infection occurs. Ideally, such a promoter should only be active locally
at the sites of infection, and in this way APS only accumulates in cells which need to
synthesize the APS to kill the invading phytopathogen. Preferred promoters of this kind
include those described by Stanford et al. Mol. Gen. Genet. 215: 200-208 (1989), Xu et al.
Plant Molec. Biol. 22: 573-588 (1993), Logemann et al. Plant Cell 1: 151-158 (1989),
Rohrmeier & Lehle, Plant Molec. Biol. 22: 783-792 (1993), Firek et al. Plant Molec. Biol. 22:
129-142 (1993), and Warner et al. Plant J. 1: 191-201 (1993).

Preferred tissue specific expression patterns include green tissue specific, root specific,
stem specific, and flower specific. Promoters suitable for expression in green tissue include
many which regulate genes involved in photosynthesis and many of these have been
cloned from both monocotyledons and dicotyledons. A preferred promoter is the maize
PEPC promoter from the phosphoenolpyruvate carboxylase gene (Hudspeth & Grula, Plant Molec.
Biol. 12: 579-589 (1989)). A preferred promoter for root specific expression is that
described by de Framond (FEBS 290: 103-106 (1991); EP 0 452 269 to Ciba-Geigy) and a
further preferred root-specific promoter is that from the T-1 gene provided by this invention.
A preferred stem specific promoter is that described in patent application WO 93/07278 (to
Ciba-Geigy) and which drives expression of the maize trpA gene.

Preferred embodiments of the invention are transgenic plants expressing APS biosynthetic
genes in a root-specific fashion. In an especially preferred embodiment of the invention the
biosynthetic genes for pyrrolnitrin are expressed behind a root specific promoter to protect
transgenic plants against the phytopathogen Rhizoctonia. In another especially preferred
embodiment of the invention the biosynthetic genes for phenazine are expressed behind a
root specific promoter to protect transgenic plants against the phytopathogen
Gaeumannomyces graminis. Further preferred embodiments are transgenic plants
expressing APS biosynthetic genes in a wound-inducible or pathogen infection-inducible
manner. For example, a further especially preferred embodiment involves the expression of
the biosynthetic genes for soraphen behind a wound-inducible or pathogen-inducible
promoter for the control of foliar pathogens.

In addition to the selection of a suitable promoter, constructions for APS expression in 
plants require an appropriate transcription terminator to be attached downstream of the 
heterologous APS gene. Several such terminators are available and known in the art (e.g.
tml from CaMV, E9 from rbcS). Any available terminator known to function in plants can be
used in the context of this invention. 

Numerous other sequences can be incorporated into expression cassettes for APS genes. 
These include sequences which have been shown to enhance expression such as intron 
sequences (e.g. from Adh1 and bronze1) and viral leader sequences (e.g. from TMV,
MCMV and AMV).

The overproduction of APSs in plants requires that the APS biosynthetic gene encoding the 
first step in the pathway will have access to the pathway substrate. For each individual APS 
and pathway involved, this substrate will likely differ, and so too may its cellular localization 
in the plant. In many cases the substrate may be localized in the cytosol, whereas in other
cases it may be localized in some subcellular organelle. As much biosynthetic activity in the
plant occurs in the chloroplast, often the substrate may be localized to the chloroplast and
consequently the APS biosynthetic gene products for such a pathway are best targeted to
the appropriate organelle (e.g. the chloroplast). Subcellular localization of transgene
encoded enzymes can be undertaken using techniques well known in the art. Typically, the
DNA encoding the target peptide from a known organelle-targeted gene product is
manipulated and fused upstream of the required APS gene/s. Many such target sequences 
are known for the chloroplast and their functioning in heterologous constructions has been 
shown. In a preferred embodiment of this invention the genes for pyrrolnitrin biosynthesis 
are targeted to the chloroplast because the pathway substrate tryptophan is synthesized in 
the chloroplast. 

In some situations, the overexpression of APS genes may deplete the cellular availability of
the substrate for a particular pathway and this may have detrimental effects on the cell. In
situations such as this it is desirable to increase the amount of substrate available by the
overexpression of genes which encode the enzymes for the biosynthesis of the substrate.
In the case of tryptophan (the substrate for pyrrolnitrin biosynthesis) this can be achieved by
overexpressing the trpA and trpB genes as well as anthranilate synthase subunits.
Similarly, overexpression of the enzymes for chorismate biosynthesis such as DAHP
synthase will be effective in producing the precursor required for phenazine production. A
further way of making more substrate available is by the turning off of known pathways
which utilize specific substrates (provided this can be done without detrimental side effects).
In this manner, the substrate synthesized is channeled towards the biosynthesis of the APS
and not towards other compounds. 

Vectors suitable for plant transformation are described elsewhere in this specification. For
Agrobacterium-mediated transformation, binary vectors or vectors carrying at least one T-
DNA border sequence are suitable, whereas for direct gene transfer any vector is suitable
and linear DNA containing only the construction of interest may be preferred. In the case of
direct gene transfer, transformation with a single DNA species or co-transformation can be
used (Schocher et al. Biotechnology 4: 1093-1096 (1986)). For both direct gene transfer
and Agrobacterium-mediated transfer, transformation is usually (but not necessarily)
undertaken with a selectable marker which may provide resistance to an antibiotic
(kanamycin, hygromycin or methotrexate) or a herbicide (Basta). The choice of selectable
marker is not, however, critical to the invention.

Synthesis of an APS in a transgenic plant will frequently require the simultaneous 
overexpression of multiple genes encoding the APS biosynthetic enzymes. This can be 
achieved by transforming the individual APS biosynthetic genes into different plant lines 
individually, and then crossing the resultant lines. Selection and maintenance of lines 
carrying multiple genes is facilitated if the various transformation constructions each utilize
a different selectable marker. A line in which all the required APS biosynthetic genes have
been pyramided will synthesize the APS, whereas other lines will not. This approach may
be suitable for hybrid crops such as maize in which the final hybrid is necessarily a cross
between two parents. The maintenance of different inbred lines with different APS genes
may also be advantageous in situations where a particular APS pathway may lead to
multiple APS products, each of which has a utility. By utilizing different lines carrying
different alternative genes for later steps in the pathway to make a hybrid cross with lines
carrying all the remaining required genes it is possible to generate different hybrids carrying
different selected APSs which may have different utilities. 
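
How many progeny of such a cross carry the complete gene set depends on the zygosity of the parent lines. The specification does not state this arithmetic; the Python sketch below is a worked example under ordinary Mendelian assumptions that are introduced here: a diploid plant, each transgene inserted at a single locus, all loci segregating independently, and each gene contributed by only one parent. The gene names and function names are hypothetical.

```python
from fractions import Fraction


def transmission_probability(copies: int) -> Fraction:
    """Chance that a parent passes on a single-locus transgene.

    copies = 1: hemizygous parent, transmits to half of its gametes.
    copies = 2: homozygous parent, transmits to all of its gametes.
    """
    if copies not in (1, 2):
        raise ValueError("copies must be 1 (hemizygous) or 2 (homozygous)")
    return Fraction(copies, 2)


def fraction_with_all_genes(copies_per_gene: dict) -> Fraction:
    """Expected fraction of progeny inheriting every required gene.

    Assumes independent segregation, so the per-gene probabilities multiply.
    """
    result = Fraction(1)
    for copies in copies_per_gene.values():
        result *= transmission_probability(copies)
    return result


# Hypothetical two-gene pathway, one gene per parent line.
hemizygous_cross = fraction_with_all_genes({"geneA": 1, "geneB": 1})
homozygous_cross = fraction_with_all_genes({"geneA": 2, "geneB": 2})
```

Under these assumptions a cross between two hemizygous lines gives one quarter of progeny with both genes, while a cross between two homozygous inbred lines gives a hybrid in which every plant carries the full set, which is consistent with the suitability of this approach for hybrid crops noted above.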

Alternate methods of producing plant lines carrying multiple genes include the
retransformation of existing lines already transformed with an APS gene or APS genes (and
selection with a different marker), and also the use of single transformation vectors which
carry multiple APS genes, each under appropriate regulatory control (i.e. promoter,
terminator etc.). Given the ease of DNA construction, the manipulation of cloning vectors to
carry multiple APS genes is a preferred method. 

Before plant propagation material (fruit, tuber, grains, seed), and especially before seed, is 
sold as a commercial product, it is customarily treated with a protectant coating comprising 
herbicides, insecticides, fungicides, bactericides, nematicides, molluscicides or mixtures of 
several of these compounds. If desired these compounds are formulated together with 
further carriers, surfactants or application-promoting adjuvants customarily employed in the 
art of formulation to provide protection against damage caused by bacterial, fungal or 
animal pests. 

In order to treat the seed, the protectant coating may be applied to the seeds either by 
impregnating the tubers or grains with a liquid formulation or by coating them with a 
combined wet or dry formulation. In special cases other methods of application to plants are 
possible, such as treatment directed at the buds or the fruit. 

A plant seed according to the invention comprises a DNA sequence encoding for the 
production of an antipathogenic substance and may be treated with a seed protectant 
coating comprising a seed treatment compound such as captan, carboxin, thiram (TMTD®), 
metalaxyl (Apron®), pirimiphos-methyl (Actellic®) and others that are commonly used in 
seed treatment. It is thus a further object of the present invention to provide plant 
propagation material and especially seed encoding for the production of an antipathogenic 
substance, which material is treated with a seed protectant coating customarily used in 
seed treatment. 

Production of Antipathogenic Substances in Heterologous Hosts 

The present invention also provides methods for obtaining APSs. These APSs may be 
effective in the inhibition of growth of microbes, particularly phytopathogenic microbes. The 
APSs can be produced in large quantities from organisms in which the APS genes have 
been overexpressed, and suitable organisms for this include gram-negative and 
gram-positive bacteria and yeast as well as plants. For the purposes of APS production, the 
significant criteria in the choice of host organism are its ease of manipulation, rapidity of 
growth (i.e. fermentation in the case of microorganisms), and its lack of susceptibility to the 
APS being overproduced. In a preferred embodiment of the invention enhanced amounts 
of an antipathogenic substance are synthesized in a host in which the antipathogenic 
substance naturally occurs, wherein said host is transformed with one or more DNA 
molecules collectively encoding the complete set of polypeptides required to synthesize 
said antipathogenic substance. These methods of APS production have significant 
advantages over the chemical synthesis technology usually used in the preparation of APSs 
such as antibiotics. These advantages are the lower cost of production, and the ability to 
synthesize compounds of a preferred biological enantiomer, as opposed to the racemic 
mixtures inevitably generated by organic synthesis. The ability to produce stereochemically 
appropriate compounds is particularly important for molecules with many chirally active 
carbon atoms. APSs produced by heterologous hosts can be used in medical (i.e. control 
of pathogens and/or infectious disease) as well as agricultural applications. 

Formulation of Antipathogenic Compositions 

The present invention further embraces the preparation of antifungal compositions in which 
the active ingredient is the antibiotic substance produced by the recombinant biocontrol 
agent of the present invention or alternatively a suspension or concentrate of the 
microorganism. The active ingredient is homogeneously mixed with one or more 
compounds or groups of compounds described herein. The present invention also relates 
to methods of protecting plants against a phytopathogen, which comprise application of the 
active ingredient, or antifungal compositions containing the active ingredient, to plants in 
amounts which inhibit said phytopathogen. 

The active ingredients of the present invention are normally applied in the form of 
compositions and can be applied to the crop area or plant to be treated, simultaneously or 
in succession, with further compounds. These compounds can be both fertilizers or 
micronutrient donors or other preparations that influence plant growth. They can also be 
selective herbicides, insecticides, fungicides, bactericides, nematicides, molluscicides or 
mixtures of several of these preparations, if desired together with further carriers, 
surfactants or application-promoting adjuvants customarily employed in the art of 
formulation. Suitable carriers and adjuvants can be solid or liquid and correspond to the 
substances ordinarily employed in formulation technology, e.g. natural or regenerated 
mineral substances, solvents, dispersants, wetting agents, tackifiers, binders or fertilizers. 

A preferred method of applying active ingredients of the present invention or an 
agrochemical composition which contains at least one of the active ingredients is leaf 
application. The number of applications and the rate of application depend on the intensity 
of infestation by the corresponding phytopathogen (type of fungus). However, the active 
ingredients can also penetrate the plant through the roots via the soil (systemic action) by 
impregnating the locus of the plant with a liquid composition, or by applying the compounds 
in solid form to the soil, e.g. in granular form (soil application). The active ingredients may 
also be applied to seeds (coating) by impregnating the seeds either with a liquid formulation 
containing active ingredients, or coating them with a solid formulation. In special cases, 
further types of application are also possible, for example, selective treatment of the plant 
stems or buds. 

The active ingredients are used in unmodified form or, preferably, together with the 
adjuvants conventionally employed in the art of formulation, and are therefore formulated in 
known manner to emulsifiable concentrates, coatable pastes, directly sprayable or dilutable 
solutions, dilute emulsions, wettable powders, soluble powders, dusts, granulates, and also 
encapsulations, for example, in polymer substances. Like the nature of the compositions, 
the methods of application, such as spraying, atomizing, dusting, scattering or pouring, are 
chosen in accordance with the intended objectives and the prevailing circumstances. 
Advantageous rates of application are normally from 50 g to 5 kg of active ingredient (a.i.) 
per hectare, preferably from 100 g to 2 kg a.i./ha, most preferably from 200 g to 500 g 
a.i./ha. 
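
The application-rate ranges stated above lend themselves to a small worked calculation. The following Python sketch is illustrative only: the function names, the idea of a formulated product with a given a.i. fraction, and the example figures are assumptions introduced here, not part of the described method. It converts a chosen rate in g a.i./ha into the mass of formulated product needed for a field, and reports which of the stated ranges the rate falls into.

```python
def product_needed_kg(rate_g_ai_per_ha, area_ha, ai_fraction):
    """Mass of formulated product (kg) that delivers the chosen a.i. rate.

    ai_fraction is the mass fraction of active ingredient in the product
    (a hypothetical parameter, e.g. 0.5 for a 50 % formulation).
    """
    if not 0 < ai_fraction <= 1:
        raise ValueError("ai_fraction must be in (0, 1]")
    grams_ai = rate_g_ai_per_ha * area_ha       # total a.i. required
    return grams_ai / ai_fraction / 1000.0      # grams of product -> kg


def rate_band(rate_g_ai_per_ha):
    """Classify a rate against the ranges given in the text (g a.i./ha)."""
    if 200 <= rate_g_ai_per_ha <= 500:
        return "most preferred"
    if 100 <= rate_g_ai_per_ha <= 2000:
        return "preferred"
    if 50 <= rate_g_ai_per_ha <= 5000:
        return "advantageous"
    return "outside stated range"


if __name__ == "__main__":
    # Hypothetical case: 200 g a.i./ha on 10 ha with a 50 % product.
    print(product_needed_kg(200, 10, 0.5), rate_band(200))
```

The bands are checked from narrowest to widest, so a rate is always reported under the most specific range it satisfies.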

The formulations, compositions or preparations containing the active ingredients and, where 
appropriate, a solid or liquid adjuvant, are prepared in known manner, for example by 
homogeneously mixing and/or grinding the active ingredients with extenders, for example 
solvents, solid carriers and, where appropriate, surface-active compounds (surfactants). 

Suitable solvents include aromatic hydrocarbons, preferably the fractions having 8 to 12 
carbon atoms, for example, xylene mixtures or substituted naphthalenes, phthalates such 
as dibutyl phthalate or dioctyl phthalate, aliphatic hydrocarbons such as cyclohexane or 
paraffins, alcohols and glycols and their ethers and esters, such as ethanol, ethylene glycol 
monomethyl or monoethyl ether, ketones such as cyclohexanone, strongly polar solvents 
such as N-methyl-2-pyrrolidone, dimethyl sulfoxide or dimethyl formamide, as well as 
epoxidized vegetable oils such as epoxidized coconut oil or soybean oil; or water. 

The solid carriers used, e.g. for dusts and dispersible powders, are normally natural mineral 
fillers such as calcite, talcum, kaolin, montmorillonite or attapulgite. In order to improve the 
physical properties it is also possible to add highly dispersed silicic acid or highly dispersed 
absorbent polymers. Suitable granulated adsorptive carriers are porous types, for example 
pumice, broken brick, sepiolite or bentonite; and suitable nonsorbent carriers are materials 
such as calcite or sand. In addition, a great number of pregranulated materials of inorganic 
or organic nature can be used, e.g. especially dolomite or pulverized plant residues. 

Depending on the nature of the active ingredient to be used in the formulation, suitable 
surface-active compounds are nonionic, cationic and/or anionic surfactants having good 
emulsifying, dispersing and wetting properties. The term "surfactants" will also be 
understood as comprising mixtures of surfactants. 

Suitable anionic surfactants can be both water-soluble soaps and water-soluble synthetic 
surface-active compounds. 

Suitable soaps are the alkali metal salts, alkaline earth metal salts or unsubstituted or 
substituted ammonium salts of higher fatty acids (chains of 10 to 22 carbon atoms), for 
example the sodium or potassium salts of oleic or stearic acid, or of natural fatty acid 
mixtures which can be obtained for example from coconut oil or tallow oil. The fatty acid 
methyltaurin salts may also be used. 

More frequently, however, so-called synthetic surfactants are used, especially fatty 
sulfonates, fatty sulfates, sulfonated benzimidazole derivatives or alkylarylsulfonates. 

The fatty sulfonates or sulfates are usually in the form of alkali metal salts, alkaline earth 
metal salts or unsubstituted or substituted ammonium salts and have an 8 to 22 carbon alkyl 
radical which also includes the alkyl moiety of alkyl radicals, for example, the sodium or 
calcium salt of lignosulfonic acid, of dodecylsulfate or of a mixture of fatty alcohol sulfates 
obtained from natural fatty acids. These compounds also comprise the salts of sulfuric acid 
esters and sulfonic acids of fatty alcohol/ethylene oxide adducts. The sulfonated 
benzimidazole derivatives preferably contain 2 sulfonic acid groups and one fatty acid 
radical containing 8 to 22 carbon atoms. Examples of alkylarylsulfonates are the sodium, 
calcium or triethanolamine salts of dodecylbenzenesulfonic acid, dibutylnaphthalenesulfonic 
acid, or of a naphthalenesulfonic acid/formaldehyde condensation product. Also suitable 
are corresponding phosphates, e.g. salts of the phosphoric acid ester of an adduct of 
p-nonylphenol with 4 to 14 moles of ethylene oxide. 

Non-ionic surfactants are preferably polyglycol ether derivatives of aliphatic or cycloaliphatic 
alcohols, or saturated or unsaturated fatty acids and alkylphenols, said derivatives 
containing 3 to 30 glycol ether groups and 8 to 20 carbon atoms in the (aliphatic) 
hydrocarbon moiety and 6 to 18 carbon atoms in the alkyl moiety of the alkylphenols. 

Further suitable non-ionic surfactants are the water-soluble adducts of polyethylene oxide 
with polypropylene glycol, ethylenediamine propylene glycol and alkylpolypropylene glycol 
containing 1 to 10 carbon atoms in the alkyl chain, which adducts contain 20 to 250 
ethylene glycol ether groups and 10 to 100 propylene glycol ether groups. These 
compounds usually contain 1 to 5 ethylene glycol units per propylene glycol unit. 

Representative examples of non-ionic surfactants are nonylphenolpolyethoxyethanols, 
castor oil polyglycol ethers, polypropylene/polyethylene oxide adducts, 
tributylphenoxypolyethoxyethanol, polyethylene glycol and octylphenoxyethoxyethanol. 
Fatty acid esters of polyoxyethylene sorbitan and polyoxyethylene sorbitan trioleate are also 
suitable non-ionic surfactants. 

Cationic surfactants are preferably quaternary ammonium salts which have, as 
N-substituent, at least one C8-C22 alkyl radical and, as further substituents, lower 
unsubstituted or halogenated alkyl, benzyl or lower hydroxyalkyl radicals. The salts are 
preferably in the form of halides, methylsulfates or ethylsulfates, e.g. 
stearyltrimethylammonium chloride or benzyldi(2-chloroethyl)ethylammonium bromide. 

The surfactants customarily employed in the art of formulation are described, for example, 
in "McCutcheon's Detergents and Emulsifiers Annual," MC Publishing Corp., Ringwood, New 
Jersey, 1979, and Sisely and Wood, "Encyclopedia of Surface Active Agents," Chemical 
Publishing Co., Inc., New York, 1980. 

The agrochemical compositions usually contain from about 0.1 to about 99 %, preferably 
about 0.1 to about 95 %, and most preferably from about 3 to about 90 % of the active 
ingredient, from about 1 to about 99.9 %, preferably from about 1 to about 99 %, and most 
preferably from about 5 to about 95 % of a solid or liquid adjuvant, and from about 0 to 
about 25 %, preferably about 0.1 to about 25 %, and most preferably from about 0.1 to 
about 20 % of a surfactant. 

Whereas commercial products are preferably formulated as concentrates, the end user will 
normally employ dilute formulations. 
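
As a rough illustration of the composition ranges given above, the following Python sketch checks a candidate recipe against the usual, preferred and most preferred percentage ranges. It is a sketch under stated assumptions: the three components are treated as summing to 100 %, which the text does not state explicitly, and the names `RANGES` and `classify` are hypothetical.

```python
# (low, high) percentage limits taken from the ranges stated in the text.
RANGES = {
    "active_ingredient": {"usual": (0.1, 99.0), "preferred": (0.1, 95.0),
                          "most_preferred": (3.0, 90.0)},
    "adjuvant":          {"usual": (1.0, 99.9), "preferred": (1.0, 99.0),
                          "most_preferred": (5.0, 95.0)},
    "surfactant":        {"usual": (0.0, 25.0), "preferred": (0.1, 25.0),
                          "most_preferred": (0.1, 20.0)},
}
TIERS = ("most_preferred", "preferred", "usual")  # narrowest tier first


def classify(composition, tol=1e-6):
    """Return the narrowest tier that every component satisfies, or None.

    composition maps each component name to its percentage.
    Assumption: the three percentages together make up 100 %.
    """
    if set(composition) != set(RANGES):
        raise ValueError("composition must list exactly: %s" % sorted(RANGES))
    if abs(sum(composition.values()) - 100.0) > tol:
        raise ValueError("percentages must sum to 100")
    for tier in TIERS:
        if all(RANGES[name][tier][0] <= pct <= RANGES[name][tier][1]
               for name, pct in composition.items()):
            return tier
    return None


if __name__ == "__main__":
    recipe = {"active_ingredient": 25, "adjuvant": 70, "surfactant": 5}
    print(classify(recipe))
```

Because each narrower range lies inside the wider one for every component, checking the tiers from narrowest to widest returns the most specific tier that applies.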

EXAMPLES 

The following examples serve as further description of the invention and methods for 
practicing the invention. They are not intended as being limiting, rather as providing 
guidelines on how the invention may be practiced. 

A. Identification of Microorganisms which Produce Antipathogenic Substances 
Microorganisms can be isolated from many sources and screened for their ability to inhibit 
fungal or bacterial growth in vitro. Typically the microorganisms are diluted and plated on 
medium onto or into which fungal spores or mycelial fragments, or bacteria have been or 
are to be introduced. Thus, zones of clearing around a newly isolated bacterial colony are 
indicative of antipathogenic activity. 

Example 1: Isolation of Microorganisms with Anti-Rhizoctonia Properties from Soil 
A gram of soil (containing approximately 10^-10^ bacteria) is suspended in 10 ml sterile 
water. After vigorously mixing, the soil particles are allowed to settle. Appropriate dilutions 
are made and aliquots are plated on nutrient agar plates (or other growth medium as 
appropriate) to obtain 50-100 colonies per plate. Freshly cultured Rhizoctonia mycelia are 
fragmented by blending and suspensions of fungal fragments are sprayed on to the agar 
plates after the bacterial colonies have grown to be just visible. Bacterial isolates with 
antifungal activities can be identified by the fungus-free zones surrounding them upon 
further incubation of the plates. 
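
The choice of "appropriate dilutions" in this example can be estimated with simple arithmetic. The sketch below is a minimal illustration, assuming tenfold serial dilutions and a plated volume of 0.1 ml, neither of which is specified in the text; the bacterial count per gram is passed in as a parameter, and the function name `plan_dilution` is hypothetical.

```python
def plan_dilution(cfu_per_gram, suspension_ml=10.0, plated_ml=0.1,
                  target=(50, 100)):
    """Estimate how many tenfold dilutions bring a plate into the target range.

    cfu_per_gram  : assumed viable count in one gram of soil (an input)
    suspension_ml : volume the gram of soil is suspended in (10 ml in the text)
    plated_ml     : volume spread per plate (assumed value, not from the text)
    Returns (number_of_tenfold_dilutions, expected_colonies, in_target_range).
    """
    low, high = target
    colonies = cfu_per_gram * plated_ml / suspension_ml  # undiluted plate
    steps = 0
    while colonies > high:        # dilute tenfold until not above the window
        colonies /= 10.0
        steps += 1
    # Tenfold steps can overshoot below the window; report that honestly.
    return steps, colonies, low <= colonies <= high


if __name__ == "__main__":
    # Hypothetical count of 5 million viable bacteria per gram of soil.
    print(plan_dilution(5e6))
```

When the result falls below the lower bound, an intermediate dilution factor or a larger plated volume would be needed; the sketch only flags this case rather than solving for it.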

The production of bioactive metabolites by such isolates is confirmed by the use of culture 
filtrates in place of live colonies in the plate assay described above. Such bioassays can 
also be used for monitoring the purification of the metabolites. Purification may start with an 
organic solvent extraction step and, depending on whether the active principle is extracted 
into the organic phase or left in the aqueous phase, different chromatographic steps follow. 
These chromatographic steps are well known in the art. Ultimately, purity and chemical 
identity are determined using spectroscopic methods. 

B. Cloning Antipathogenic Biosynthetic Genes from Microorganisms 

Example 2: Shotgun Cloning Antipathogenic Biosynthetic Genes from their Native 
Source 

Related biosynthetic genes are typically located in close proximity to each other in 
microorganisms and more than one open reading frame is often encoded by a single 
operon. Consequently, one approach to the cloning of genes which encode enzymes in a 
single biosynthetic pathway is the transfer of genome fragments from a microorganism 
containing said pathway to one which does not, with subsequent screening for a phenotype 
conferred by the pathway. 

In the case of biosynthetic genes encoding enzymes leading to the production of an 
antipathogenic substance (APS), genomic DNA of the antipathogenic substance producing 
microorganism is isolated, digested with a restriction endonuclease such as Sau3A, size 
fractionated for the isolation of fragments of a selected size (the selected size depends on 
the vector being used), and fragments of the selected size are cloned into a vector (e.g. the 
BamHI site of a cosmid vector) for transfer to E. coli. The resulting E. coli clones are then 
screened for those which are producing the antipathogenic substance. Such screens may 
be based on the direct detection of the antipathogenic substance, such as a biochemical 
assay. 

Alternatively, such screens may be based on the adverse effect associated with the 
antipathogenic substance upon a target pathogen. In these screens, the clones producing 
the antipathogenic substance are selected for their ability to kill or retard the growth of the 
target pathogen. Such an inhibitory activity forms the basis for standard screening assays 
well known in the art, such as screening for the ability to produce zones of clearing on a 
bacterial plate impregnated with the target pathogen (e.g. spores where the target pathogen 
is a fungus, cells where the target pathogen is a bacterium). Clones selected for their 
antipathogenic activity can then be further analyzed to confirm the presence of the 
antipathogenic substance using the standard chemical and biochemical techniques 
appropriate for the particular antipathogenic substance. 
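
The digestion and size-fractionation step of this example can be pictured with a short in-silico sketch. The Python code below is illustrative only and makes several assumptions: Sau3A is taken to recognize GATC and cut immediately before the G, a partial digest is modelled as any fragment bounded by two recognition sites, and the function names are hypothetical. It lists the fragments of a linear sequence that fall within a chosen size window and therefore carry Sau3A-generated ends on both sides, as would be needed for ligation into a BamHI site.

```python
import re

SAU3A_SITE = "GATC"  # assumed recognition sequence; cut is taken as ^GATC


def cut_positions(seq):
    """0-based positions at which Sau3A would cut (start of each GATC)."""
    return [m.start() for m in re.finditer("(?=%s)" % SAU3A_SITE, seq.upper())]


def size_selected_fragments(seq, min_len, max_len):
    """Fragments bounded by two Sau3A cuts with min_len <= length <= max_len.

    In a partial digest any pair of sites can delimit a fragment, so all
    pairs are considered. Fragments ending at the ends of the input
    molecule are excluded because those ends are not Sau3A-generated.
    Returns a list of (start, end) half-open coordinates.
    """
    cuts = cut_positions(seq)
    fragments = []
    for i, start in enumerate(cuts):
        for end in cuts[i + 1:]:
            length = end - start
            if length > max_len:
                break                 # later sites only give longer fragments
            if length >= min_len:
                fragments.append((start, end))
    return fragments


if __name__ == "__main__":
    toy = "AAGATCTTTTGATCCCCCCGATCAA"  # toy sequence, far shorter than real DNA
    print(cut_positions(toy))
    print(size_selected_fragments(toy, 9, 17))
```

For a real cosmid library the window would be set to the insert size the vector accepts; the toy sequence here only demonstrates the bookkeeping.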

Further characterization and identification of the genes encoding the biosynthetic enzymes 
for the antipathogenic substance is achieved as follows. DNA inserts from positively 
identified E. coli clones are isolated and further digested into smaller fragments. The 
smaller fragments are then recloned into vectors and reinserted into E. coli with subsequent 
reassaying for the antipathogenic phenotype. Alternatively, positively identified clones can 
be subjected to λ::Tn5 transposon mutagenesis using techniques well known in the art (e.g. 
de Bruijn & Lupski, Gene 27: 131-149 (1984)). Using this method a number of disruptive 
transposon insertions are introduced into the DNA shown to confer APS production to 
enable a delineation of the precise region/s of the DNA which are responsible for APS 
production. Subsequently, determination of the sequence of the smallest insert found to 
confer antipathogenic substance production on E. coli will reveal the open reading frames 
required for APS production. These open reading frames can ultimately be disrupted (see 
below) to confirm their role in the biosynthesis of the antipathogenic substance. 
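
The delineation of the responsible region from a set of transposon insertions can be sketched as a simple interval calculation. The Python code below is a minimal illustration under stated assumptions: each insertion is reduced to a map position and a yes/no phenotype, a single contiguous region is assumed, and effects such as polarity on downstream genes are ignored. The function name `delineate_region` and the example coordinates are hypothetical.

```python
def delineate_region(insertions):
    """Bound the region required for APS production from insertion data.

    insertions: iterable of (position_bp, abolishes_production) pairs.
    Returns None if no insertion abolishes production, otherwise a pair
    (core, outer) where
      core  = (first, last) disruptive insertion positions, the minimum
              extent of the required region, and
      outer = (left, right) nearest non-disruptive insertions on either
              side, the maximum extent (None where no such insertion exists).
    """
    disruptive = sorted(p for p, knocked_out in insertions if knocked_out)
    neutral = sorted(p for p, knocked_out in insertions if not knocked_out)
    if not disruptive:
        return None
    core = (disruptive[0], disruptive[-1])
    left = max((p for p in neutral if p < core[0]), default=None)
    right = min((p for p in neutral if p > core[1]), default=None)
    return core, (left, right)


if __name__ == "__main__":
    # Hypothetical map: positions in bp along the cloned insert.
    data = [(500, False), (1200, True), (2500, True),
            (3100, True), (4000, False), (6000, False)]
    print(delineate_region(data))
```

A denser set of insertions narrows the gap between the core and outer bounds, which is why a number of insertions is introduced rather than a few.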

Various host organisms such as Bacillus and yeast may be substituted for E. coli in the 
techniques described using suitable cloning vectors known in the art for such hosts. The 
choice of host organism has only one limitation: it should not be sensitive to the 
antipathogenic substance for which the biosynthetic genes are being cloned. 

Example 3: Cloning Biosynthetic Genes for an Antipathogenic Substance using 
Transposon Mutagenesis 

In many microorganisms which are known to produce antipathogenic substances, 
transposon mutagenesis is a routine technique used for the generation of insertion mutants. 
This technique has been used successfully in Pseudomonas (e.g. Lam et al., Plasmid 
13: 200-204 (1985)), Bacillus (e.g. Youngman et al., Proc. Natl. Acad. Sci. USA 80: 
2305-2309 (1983)), Staphylococcus (e.g. Pattee, J. Bacteriol. 145: 479-488 (1981)), and 
Streptomyces (e.g. Schauer et al., J. Bacteriol. 173: 5060-5067 (1991)), among others. The 
main requirement for the technique is the ability to introduce a transposon-containing 
plasmid into the microorganism, enabling the transposon to insert itself at a random position 
in the genome. A large library of insertion mutants is created by introducing a 
transposon-carrying plasmid into a large number of microorganisms. Introduction of the 
plasmid into the microorganism can be by any appropriate standard technique such as 
conjugation or direct gene transfer techniques such as electroporation. 

Once a transposon library has been created in the manner described above, the transposon 
insertion mutants are assayed for production of the APS. Mutants which do not produce the 
APS would be expected to predominantly occur as the result of transposon insertion into 
gene sequences required for APS biosynthesis. These mutants are therefore selected for 
further analysis. 

DNA from the selected mutants which is adjacent to the transposon insert is then cloned 
using standard techniques. For instance, the host DNA adjacent to the transposon insert 
may be cloned as part of a library of DNA made from the genomic DNA of the selected 
mutant. This adjacent host DNA is then identified from the library using the transposon as a 
DNA probe. Alternatively, if the transposon used contains a suitable gene for antibiotic 
resistance, then the insertion mutant DNA can be digested with a restriction endonuclease 
which will be predicted not to cleave within this gene sequence or between its sequence 
and the host insertion point, followed by cloning of the fragments thus generated into a 
microorganism such as E. coli which can then be subjected to selection using the chosen 
antibiotic. 

Sequencing of the DNA beyond the inserted transposon reveals the adjacent host 
sequences. The adjacent sequences can in turn be used as a hybridization probe to 
reclone the undisrupted native host DNA using a non-mutant host library. The DNA thus 
isolated from the non-mutant is characterized and used to complement the APS-deficient 
phenotype of the mutant. DNA which complements may contain either APS biosynthetic 
genes or genes which regulate all or part of the APS biosynthetic pathway. To be sure 
isolated sequences encode biosynthetic genes they can be transferred to a heterologous 
host which does not produce the APS and which is insensitive to the APS (such as E. coli). 
By transferring smaller and smaller pieces of the isolated DNA and sequencing the 
smallest effective piece, the APS genes can be identified. Alternatively, positively identified 
clones can be subjected to λ::Tn5 transposon mutagenesis using techniques well known in 
the art (e.g. de Bruijn & Lupski, Gene 27: 131-149 (1984)). Using this method a number of 
disruptive transposon insertions are introduced into the DNA shown to confer APS 
production to enable a delineation of the precise region/s of the DNA which are responsible 
for APS production. These latter steps are undertaken in a manner analogous to that 
described in example 1. In order to avoid the possibility of the cloned genes not being 
expressed in the heterologous host due to the non-functioning of their heterologous 
promoter, the cloned genes can be transferred to an expression vector where they will be 
fused to a promoter known to function in the heterologous host. In the case of E. coli an 
example of a suitable expression vector is pKK223 which utilizes the tac promoter. Similar 
suitable expression vectors also exist for other hosts such as yeast and are well known in 
the art. In general such fusions will be easy to undertake because of the operon-type 
organization of related genes in microorganisms and the likelihood that the biosynthetic 
enzymes required for APS biosynthesis will be encoded on a single transcript requiring only 
a single promoter fusion. 

Example 4: Cloning Antipathogenic Biosynthetic Genes using Mutagenesis and 
Complementation 

A similar method to that described above involves the use of non-insertion mutagenesis 
techniques (such as chemical mutagenesis and radiation mutagenesis) together with 
complementation. The APS-producing microorganism is subjected to non-insertion 
mutagenesis and mutants which lose the ability to produce the APS are selected for further 
analysis. A gene library is prepared from the parent APS-producing strain. One suitable 
approach would be the ligation of fragments of 20-30 kb into a vector such as pVK100 
(Knauf et al. Plasmid 8: 45-54 (1982)) and introduction into E. coli harboring the tra+ plasmid 
pRK2013, which would enable the transfer by triparental conjugation back to the selected 
APS-minus mutant (Ditta et al. Proc. Natl. Acad. Sci. USA 77: 7247-7351 (1980)). A further 
suitable approach would be the transfer back to the mutant of the gene library via 
electroporation. In each case subsequent selection is for APS production. Selected colonies 
are further characterized by the retransformation of the APS-minus mutant with smaller 
fragments of the complementing DNA to identify the smallest successfully complementing 
fragment which is then subjected to sequence analysis. As with example 2, genes isolated by 
this procedure may be biosynthetic genes or genes which regulate the entire or part of the 
APS biosynthetic pathway. To be sure that the isolated sequences encode biosynthetic genes 
they can be transferred to a heterologous host which does not produce the APS and is 
insensitive to the APS (such as E. coli). These latter steps are undertaken in a manner 
analogous to that described in example 2. 

Example 5: Cloning Antipathogenic Biosynthetic Genes by Exploiting Regulators 
which Control the Expression of the Biosynthetic Genes 

A further approach in the cloning of APS biosynthetic genes relies on the use of regulators 
which control the expression of these biosynthetic genes. A library of transposon insertion 
mutants is created in a strain of microorganism which lacks the regulator or has had the 
regulator gene disabled by conventional gene disruption techniques. The insertion 
transposon used carries a promoter-less reporter gene (e.g. lacZ). Once the insertion 
library has been made, a functional copy of the regulator gene is transferred to the library of 
cells (e.g. by conjugation or electroporation) and the plated cells are selected for expression 
of the reporter gene. Cells are assayed before and after transfer of the regulator gene. 
Colonies which express the reporter gene only in the presence of the regulator gene carry 
insertions adjacent to the promoter of genes regulated by the regulator. Assuming the 
regulator is specific in its regulation for APS-biosynthetic genes, then the genes tagged by 
this procedure will be APS-biosynthetic genes. These genes can then be cloned and 
further characterized using the techniques described in example 2. 
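
The before-and-after comparison at the heart of this screen amounts to a simple filter over the colony records. The Python sketch below is illustrative only; the `Colony` record, its field names and the colony identifiers are hypothetical, and reporter expression is reduced to a yes/no call of the kind a blue/white lacZ assay would give.

```python
from dataclasses import dataclass


@dataclass
class Colony:
    colony_id: str
    reporter_without_regulator: bool  # reporter expressed before transfer
    reporter_with_regulator: bool     # reporter expressed after transfer


def regulator_dependent(colonies):
    """Return IDs of colonies whose reporter is on only with the regulator.

    These are the candidates for insertions next to promoters that the
    regulator controls. Colonies that are on in both conditions sit behind
    regulator-independent promoters and are ignored, as are colonies that
    are off in both.
    """
    return [c.colony_id for c in colonies
            if c.reporter_with_regulator and not c.reporter_without_regulator]


if __name__ == "__main__":
    library = [
        Colony("m1", False, True),   # regulator-dependent: keep
        Colony("m2", True, True),    # constitutive promoter: discard
        Colony("m3", False, False),  # no active promoter: discard
    ]
    print(regulator_dependent(library))
```

In practice the selected candidates would still need the follow-up assays described in the text, since the regulator may also control genes unrelated to APS biosynthesis.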

Example 6: Cloning Antipathogenic Biosynthetic Genes by Homology 
Standard DNA techniques can be used for the cloning of novel antipathogenic biosynthetic 
genes by virtue of their homology to known genes. A DNA library of the microorganism of 
interest is made and then probed with radiolabelled DNA derived from the gene/s for APS 
biosynthesis from a different organism. The newly isolated genes are characterized and 
sequenced and introduced into a heterologous microorganism or a mutant APS-minus 
strain of the native microorganisms to demonstrate their conferral of APS production. 

C. Cloning of Pyrrolnitrin Biosynthetic Genes from Pseudomonas 

Pyrrolnitrin is a phenylpyrrole compound produced by various strains of Pseudomonas 
fluorescens. P. fluorescens strains which produce pyrrolnitrin are effective biocontrol strains 
against Rhizoctonia and Pythium fungal pathogens (WO 94/01561). The biosynthesis of 
pyrrolnitrin is postulated to start from tryptophan (Chang et al. J. Antibiotics 34: 555-566 
(1981)). 

Example 7: Use of the gafA Regulator Gene for the Isolation of Pyrrolnitrin 
Biosynthetic Genes from Pseudomonas 

The gene cluster encoding pyrrolnitrin biosynthetic enzymes was isolated using the basic 
principle described in example 5 above. The regulator gene used in this isolation procedure 
was the gafA gene from Pseudomonas fluorescens, which is known to be part of a 
two-component regulatory system controlling certain biocontrol genes in Pseudomonas. The 
gafA gene is described in detail in WO 94/01561 which is hereby incorporated by reference 
in its entirety. gafA is further described in Gaffney et al. (Molecular Plant-Microbe 
Interactions 7: 455-463, 1994, also hereby incorporated in its entirety by reference) where it 
is referred to as "ORF5". The gafA gene has been shown to regulate pyrrolnitrin 
biosynthesis, chitinase, gelatinase and cyanide production. Strains which lack the gafA 
gene or which express the gene at low levels (and in consequence gafA-regulated genes 
also at low levels) are suitable for use in this isolation technique. 

Example 8: Isolation of Pyrrolnitrin Biosynthesis Genes in Pseudomonas 
The transfer of the gafA gene from MOCG134 to closely related non-pyrrolnitrin producing 
wild-type strains of Pseudomonas fluorescens results in the ability of these strains to 
produce pyrrolnitrin (Gaffney et al., MPMI (1994); see also Hill et al. Applied And 
Environmental Microbiology 60: 78-85 (1994)). This indicates that these closely related 
strains have the structural genes needed for pyrrolnitrin biosynthesis but are unable to 
produce the compound without activation from the gafA gene. One such closely related 
strain, MOCG133, was used for the identification of the pyrrolnitrin biosynthesis genes. The 
transposon TnCIB116 (Lam, New Directions in Biological Control: Alternatives for 
Suppressing Agricultural Pests and Diseases, pp 767-778, Alan R. Liss, Inc. (1990)) was 
used to mutagenize MOCG133. This transposon, a Tn5 derivative, encodes kanamycin 
resistance and contains a promoterless lacZ reporter gene near one end. The transposon 
was introduced into MOCG133 by conjugation, using the plasmid vector pCIB116 (Lam, 
New Directions in Biological Control: Alternatives for Suppressing Agricultural Pests and 
Diseases, pp 767-778, Alan R. Liss, Inc. (1990)) which can be mobilized into MOCG133, 
but cannot replicate in that organism. Most, if not all, of the kanamycin resistant 
transconjugants were therefore the result of transposition of TnCIB116 into different sites in 
the MOCG133 genome. When the transposon integrates into the bacterial chromosome 
behind an active promoter the lacZ reporter gene is activated. Such gene activation can be 
monitored visually by using the substrate X-gal, which releases an insoluble blue product 
upon cleavage by the lacZ gene product. Kanamycin resistant transconjugants were 
collected and arrayed on master plates which were then replica plated onto lawns of E. coli 
strain S17-1 (Simon et al., Bio/Technology 1: 784-791 (1983)) transformed with a plasmid 
carrying the wide host range RK2 origin of replication, a gene for tetracycline selection and 
the gafA gene. E. coli strain S17-1 contains chromosomally integrated tra genes for 
conjugal transfer of plasmids. Thus, replica plating of insertion transposon mutants onto a 
lawn of the S17-1/gafA E. coli results in the transfer to the insertion transposon mutants of 
the gafA-carrying plasmid and enables the activity of the lacZ gene to be assayed in the 
presence of the gafA regulator (expression of the host gafA is insufficient to cause lacZ 
expression, and introduction of gafA on a multicopy plasmid is more effective). Insertion 
mutants which had a "blue" phenotype (i.e. lacZ activity) only in the presence of gafA were 
identified. In these mutants, the transposon had integrated within genes whose expression 
was regulated by gafA. These mutants (with introduced gafA) were assayed for their 
ability to produce cyanide, chitinase, and pyrrolnitrin (as described in Gaffney et al., 1994 
MPMI, in press), activities known to be regulated by gafA (Gaffney et al., 1994 MPMI, in 
press). One mutant did not produce pyrrolnitrin but did produce cyanide and chitinase, 
indicating that the transposon had inserted in a genetic region involved only in pyrrolnitrin 
biosynthesis. DNA sequences flanking one end of the transposon were cloned by digesting 
chromosomal DNA isolated from the selected insertion mutant with Xhol, ligating the 
fragments derived from this digestion into the X/jo/site of pSP72 (Promega, cat # P2191) 
and selecting the E. coli transfonned witii the products of tills ligation on kanamycin. Ihe 
unique Xhol site wittiin tiie transposon cleaves beyond the gene for kanamycin resistance 
and envied the flanking region derived from tiie parent MOCG 133 strain to be 
concun-entiy isolated on the same Xhol fragment. In fact ttie Xhol site of the flanking 
sequence was found to be located approximately 1 kb away from the end on ttie 
transposon. A subfragment of the cloned X/io/ fragment derived exclusively from tiie -1 kb 
flanking sequence was tiien used to isolate tiie native (/.e. non-disrupted) gene region from 
a cosmid library of strain MOCG 134, The cosmid library was made from partially Sau3A 
digested MOCG 134 DNA. size selected for fragments of between 30 and 40 kb and cloned 
into tiie unique BamHI site of the cosmid vector pCIB1 19 which is a derivative of c2XB 
(Bates & Swift. Gene 26: 137-146 (1983)) and pRK290 (Ditta ef ai Proc, Nati. Acad- Sci. 
USA 77: 7247-7351 (1980)). pCIB119 is a double-cos site cosmid vector which has the 
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Wide host range RK2 origin of replication and can therefore replicate in Pseudomonas as 
well as £ coli. Several clones were Isolated from the MOCG 134 cosmid done library using 
the -1 kb flanking sequence as a hybridization probe. Of these one done was found to 
restore pyrrolnltrin production to the transposon Insertion mutant which had lost Its ability to 
produce pyrrolnltrin. This done had an insertion of -32 kb and was designated pCIB169. A 
viable cutture of E.coll DH5a comprising cosmid done pCIB169 has been deposited with the 
Agricultural Research Culture Collection (NRRL) at 1815 N. University Street. Peoria. Illinois 
61604 U.S.A. on May 20, 1994, under the accession nmber NRRL B-21256. 
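
The two-stage screen described above (lacZ activity only in the presence of introduced
gafA, followed by loss of pyrrolnitrin alone) can be summarised as a simple filter. The
following is a minimal sketch only; the record layout, field names and example mutants are
hypothetical and are not taken from the experiments.

```python
from dataclasses import dataclass


@dataclass
class Mutant:
    # All fields are hypothetical bookkeeping for one transposon insertion mutant.
    name: str
    blue_without_gafA: bool  # lacZ activity with the host gafA only
    blue_with_gafA: bool     # lacZ activity after transfer of the gafA plasmid
    cyanide: bool
    chitinase: bool
    pyrrolnitrin: bool


def gafA_regulated(m: Mutant) -> bool:
    """Blue phenotype only when gafA is supplied on the plasmid."""
    return m.blue_with_gafA and not m.blue_without_gafA


def pyrrolnitrin_specific(m: Mutant) -> bool:
    """gafA-regulated insertion that abolishes pyrrolnitrin but nothing else."""
    return gafA_regulated(m) and m.cyanide and m.chitinase and not m.pyrrolnitrin


# Invented example records, for illustration only.
mutants = [
    Mutant("mut1", True, True, True, True, True),    # constitutive lacZ, not gafA-dependent
    Mutant("mut2", False, True, False, True, True),  # gafA-regulated, cyanide affected
    Mutant("mut3", False, True, True, True, False),  # gafA-regulated, pyrrolnitrin only
]

candidates = [m.name for m in mutants if pyrrolnitrin_specific(m)]
print(candidates)
```

Only a mutant of the third kind would be carried forward for cloning of the flanking DNA.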

Example 9: Mapping and Tn5 Mutagenesis of pCIB169

The 32 kb insert of clone pCIB169 was subcloned in E. coli HB101 into pCIB189, a
derivative of pBR322 which contains a unique NotI cloning site. A convenient NotI site
within the 32 kb insert as well as the presence of NotI sites flanking the BamHI cloning site
of the parent cosmid vector pCIB119 allowed the subcloning of fragments of 14 and 18 kb
into pCIB189. These clones were both mapped by restriction digestion and figure 1 shows
the result of this. λ Tn5 transposon mutagenesis was carried out on both the 14 and 18 kb
subclones using techniques well known in the art (e.g. de Bruijn & Lupski, Gene 27: 131-149
(1984)). λ Tn5 phage conferring kanamycin resistance was used to transfect both the
14 and the 18 kb subclones described above. λ Tn5 transfections were done at a
multiplicity of infection of 0.1 with subsequent selection on kanamycin. Following
mutagenesis plasmid DNA was prepared and retransformed into E. coli HB101 with
kanamycin selection to enable the isolation of plasmid clones carrying Tn5 insertions. A
total of 30 independent Tn5 insertions were mapped along the length of the 32 kb insert
(see figure 2). Each of these insertions was crossed into MOCG134 via double
homologous recombination and verified by Southern hybridization using the Tn5 sequence
and the pCIB189 vector as hybridization probes to demonstrate the occurrence of double
homologous recombination, i.e. the replacement of the wild-type MOCG134 gene with the
Tn5-insertion gene. Pyrrolnitrin assays were performed on each of the insertions that were
crossed into MOCG134 and a genetic region of approximately 6 kb was identified to be
involved in pyrrolnitrin production (see figures 3 and 5). This region was found to be
centrally located in pCIB169 and was easily subcloned as an XbaI/NotI fragment into
pBluescript II KS (Promega). The XbaI/NotI subclone was designated pPRN5.9X/N (see
figure 4).
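
A low multiplicity of infection limits the number of cells that receive more than one phage.
The short calculation below illustrates this for the value of 0.1 used above. It assumes that
the number of phage per cell follows a Poisson distribution, which is a standard
simplification and is not stated in this example; the function name is illustrative.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k phage at the given MOI
    (Poisson model; an assumption, not part of the described protocol)."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


moi = 0.1
p_uninfected = poisson_pmf(0, moi)           # cells receiving no phage
p_single = poisson_pmf(1, moi)               # cells receiving exactly one phage
p_infected = 1.0 - p_uninfected              # cells receiving at least one phage
single_among_infected = p_single / p_infected

print(f"uninfected cells:        {p_uninfected:.3f}")
print(f"infected cells:          {p_infected:.3f}")
print(f"single hit if infected:  {single_among_infected:.3f}")
```

Under this model roughly 95% of the infected cells carry a single phage, so most kanamycin
resistant isolates would be expected to derive from one transposition source.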



WO 95/33818 PCT/IB95/00414



Example 10: Identification of Open Reading Frames in the Cloned Genetic Region

The genetic region involved in pyrrolnitrin production was subcloned into six fragments for
sequencing in the vector pBluescript II KS (see figure 4). These fragments spanned the ~6
kb XbaI/NotI fragment described above and extended from the EcoRI site on the left side of
figure 4 to the rightmost HindIII site (see figure 4). The sequence of the inserts of clones
pPRN1.77E, pPRN1.01E, pPRN1.24E, pPRN2.18E, pPRN0.8H/N, and pPRN2.7H was
determined using the Taq DyeDeoxy Terminator Cycle Sequencing Kit supplied by Applied
Biosystems, Inc., Foster City, CA, following the protocol supplied by the manufacturer.
Sequencing reactions were run on an Applied Biosystems 373A Automated DNA
Sequencer and the raw DNA sequence was assembled and edited using the "INHERIT"
software package also from Applied Biosystems, Inc. A contiguous DNA sequence of 9.7
kb was obtained corresponding to the EcoRI/HindIII fragment of Figure 3 and bounded by
EcoRI site # 2 and HindIII site # 2 depicted in figure 4.

DNA sequence analysis was performed on the contiguous 9.7 kb sequence using the GCG
software package from Genetics Computer Group, Inc., Madison, WI. The pattern
recognition program "FRAMES" was used to search for open reading frames (ORFs) in all
six translation frames of the DNA sequence. Four open reading frames were identified
using this program and the codon frequency table from ORF2 of the gafA gene region
which was previously published (WO 94/05793; figure 5). These ORFs lie entirely within the
~6 kb XbaI/NotI fragment referred to in example 9 (figure 4) and are contained within the
sequence disclosed as SEQ ID NO:1. Comparison of these four open reading frames with
the codon frequency usage table from the MOCG134 DNA sequence of the gafA region
showed that very few rare codons were used, indicating that codon usage was similar in
both of these gene regions. This strongly suggested that the four open reading frames were
real. At a 3' position to the fourth reading frame numerous ρ-independent stem loop
structures were found, suggesting a region where transcription could be stopped. It was thus
apparent that all four ORFs were translated from a single transcript. Sequence data obtained
for the regions beyond the four identified ORFs revealed a fifth open reading frame which was
subsequently determined not to be involved in pyrrolnitrin synthesis based on E. coli
expression studies.
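
The idea of searching all six translation frames for open reading frames can be sketched as
follows. This is not the "FRAMES" program, whose algorithm is not described here; it is a
minimal illustration with hypothetical function names and an invented test sequence. It also
shows how a search restricted to ATG start codons can report a shorter ORF than one that
also accepts GTG.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame, min_codons, starts):
    """Yield (begin, end) 0-based half-open coordinates of ORFs in one frame.

    An ORF runs from the first start codon seen to the next in-frame stop codon.
    """
    begin = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if begin is None and codon in starts:
            begin = i
        elif begin is not None and codon in STOP_CODONS:
            if (i - begin) // 3 >= min_codons:
                yield (begin, i + 3)
            begin = None


def six_frame_orfs(seq, min_codons=2, starts=("ATG",)):
    """Search three forward and three reverse-complement frames.

    Coordinates on the "-" strand refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for begin, end in orfs_in_frame(s, frame, min_codons, starts):
                found.append((strand, frame, begin, end))
    return found


# Invented sequence: an in-frame GTG lies upstream of the first ATG.
demo = "AAGTGCCCATGAAATTTTAAGG"
atg_only = six_frame_orfs(demo, starts=("ATG",))
atg_or_gtg = six_frame_orfs(demo, starts=("ATG", "GTG"))
print(atg_only)
print(atg_or_gtg)
```

In the invented sequence the ATG-only search starts the ORF at the internal ATG, whereas the
search that accepts GTG extends the same ORF upstream to the GTG.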
For each open reading frame (ORF) in the pyrrolnitrin gene cluster multiple putative
translation start sites were identified by the presence of an in-frame start codon (ATG or
GTG) and an upstream ribosome binding site. A complementation approach was used to
identify the actual translation start site for each gene. PCR primers were synthesized to
amplify segments of each prn gene from upstream of one of the putative ribosome binding
sites to downstream of the stop codon (Table 1). The plasmid pPRN18Not (1506 CIP3,
Figure 4) was used as the template for PCR reactions. The PCR products were cloned in
the vector pRK(KK223-3MCS) which consists of the Ptac promoter and rrs terminator from
pKK223-3 (Pharmacia) and pRK290 backbone. Plasmids containing each construct were
mobilized into the respective ORF-deletion mutants of MOCG134 as described in example
12 and by triparental matings using the helper plasmid pRK290 in E. coli HB101.
Transconjugants were selected by plating on Pseudomonas minimal medium supplemented
with 30 mg/l tetracycline. The presence of the plasmids and correct orientations of the
inserted PCR product were verified by plasmid DNA preparation, restriction digestion and
agarose gel electrophoresis. Pyrrolnitrin production was determined by extraction and TLC
assay as in example 11. For each prn gene the shortest clone restoring pyrrolnitrin
production (i.e., complementing the ORF deletion) was judged to contain the actual
translation initiation site. Thus, the initiation codons were identified as follows: ORF1 - ATG
at nucleotide position 423, ORF2 - GTG at nucleotide position 2039, ORF3 - ATG at
nucleotide position 3166, and ORF4 - ATG at nucleotide position 4894. The pattern
"FRAMES" computer program used to identify the open reading frames only recognizes
ATG start codons. Using the complementation approach described here it was determined
that ORF2 actually starts with a GTG codon at nucleotide position 2039 and is thus longer
than the open reading frame identified by the "FRAMES" program.
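
The rule used above, that the shortest clone still restoring pyrrolnitrin production contains
the actual initiation site, can be written as a small selection function. This is a sketch
under stated assumptions: the type and function names are hypothetical, the start positions
for the three ORF1 constructs are those listed in Table 1, and the non-complementing result
for ORF1-3 is assumed here because it is what the stated conclusion (start at position 423)
implies.

```python
from typing import List, NamedTuple, Optional


class Construct(NamedTuple):
    name: str
    putative_start: int  # first base of the putative start codon (SEQ ID NO:1)
    complements: bool    # restores pyrrolnitrin production in the deletion mutant


def actual_start(constructs: List[Construct]) -> Optional[int]:
    """Return the putative start codon of the shortest complementing clone.

    All clones for one ORF end downstream of the same stop codon, so the
    shortest clone is the one whose putative start lies furthest downstream.
    """
    complementing = [c for c in constructs if c.complements]
    if not complementing:
        return None
    return max(complementing, key=lambda c: c.putative_start).putative_start


orf1 = [
    Construct("ORF1-1", 357, True),
    Construct("ORF1-2", 423, True),
    Construct("ORF1-3", 477, False),  # assumed outcome, see lead-in
]

print(actual_start(orf1))
```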
Table 1: DNA constructs and hosts used to identify translation initiation sites in the
pyrrolnitrin gene cluster (a)

Construct   Start of    Putative    Stop        End of      Host        Pyrrolnitrin
            amplified   start       codon (c)   amplified   strain (d)  production
            segment     codon (b)               segment

ORF1-1      294         357         2039        2056        ORF1D       +
ORF1-2      396         423         2039        2056        ORF1D       +
ORF1-3      438         477         2039        2056        ORF1D       **
ORF2-1      2026        2039        3076        3166        ORF2D       +
ORF2-2      2145        2162        3076        3166        ORF2D
ORF2-3      2249        2215        3076        3166        ORF2D
ORF3-1      3130        3166        4869        4904        ORF3D
ORF3-2      3207        3235        4869        4904        ORF3D
ORF3-3      3329        3355        4869        4904        ORF3D
ORF4-1      4851        4894        5985        6122        ORF4D
ORF4-2      4967        4990        5985        6122        ORF4D
ORF4-3      5014        5086        5985        6122        ORF4D

(a) All nucleotide position numbers refer to the Sequence of the Pyrrolnitrin Gene Cluster
    given in SEQ ID No. 1
(b) The first base of the putative start codon
(c) The last base of the stop codon
(d) ORF deletion mutants are described in Example 12



Example 11: Expression of Pyrrolnitrin Biosynthetic Genes in E. coli

To determine if only four genes were needed for pyrrolnitrin production, these genes were
transferred into E. coli which was then assayed for pyrrolnitrin production. The expression
vector pKK223-3 was used to over-express the cloned operon in E. coli (Brosius & Holy,
Proc. Natl. Acad. Sci. USA 81: 6929 (1984)). pKK223-3 contains a strong tac promoter
which, in the appropriate host, is regulated by the lac repressor and induced by the addition
of isopropyl-β-D-thiogalactoside (IPTG) to the bacterial growth medium. This vector was
modified by the addition of further useful restriction sites to the existing multiple cloning site
to facilitate the cloning of the ~6 kb XbaI/NotI fragment (see example 7 and figure 4) and a
10 kb XbaI/KpnI fragment (see figure 4) for expression studies. In each case the cloned
fragment was under the control of the E. coli tac promoter (with IPTG induction), but was
cloned in a transcriptional fusion so that the ribosome binding site used would be that
derived from Pseudomonas. Each of these clones was transformed into E. coli XL1-Blue
host cells and induced with 2.5 mM IPTG before being assayed for pyrrolnitrin by thin layer
chromatography. Cultures were grown for 24 h after IPTG induction in 10 ml L broth at
37°C with rapid shaking, then extracted with an equal volume of ethyl acetate. The organic
phase was recovered and evaporated under vacuum, and the residue was dissolved in 20
µl of methanol. Silica gel thin layer chromatography (TLC) plates were spotted with 10 µl of
extract and run with toluene as the mobile phase. The plates were allowed to dry and
sprayed with van Urk's reagent to visualize. Van Urk's reagent comprises 1 g
p-dimethylaminobenzaldehyde in 50 ml 36% HCl and 50 ml 95% ethanol. Under these
conditions pyrrolnitrin appears as a purple spot on the TLC plate. This assay confirmed the
presence of pyrrolnitrin in both of the expression constructs. HPLC and mass spectrometry
analysis further confirmed the presence of pyrrolnitrin in both of the extracts. HPLC
analysis can be undertaken directly after redissolving in methanol (in this case the sample is
redissolved in 55% methanol) using a Hewlett Packard Hypersil ODS column (5 µm) of
dimensions 100 x 2.1 mm. Pyrrolnitrin elutes after about 14 min.

Example 11a: Construction of strain MOCG134cPrn having pyrrolnitrin biosynthetic
genes under a constitutive promoter

Transcription of the pyrrolnitrin biosynthetic genes is regulated by gafA. Thus, transcription
and pyrrolnitrin production do not reach high levels until late log and stationary growth
phase. To increase pyrrolnitrin biosynthesis in earlier growth phases the endogenous
promoter was replaced with the strong constitutive E. coli tac promoter. The Prn genes were
cloned between the tac promoter and a strong terminator sequence as described in
example 11 above. The resulting synthetic operon was inserted into a genomic clone that
had the Prn biosynthetic genes deleted but has homologous sequences both upstream and
downstream of the insertion site. This clone was mobilized into strain MOCG134ΔPrn, a
deletion mutant of the genes Prn A-D. The Prn genes under the control of the constitutive
tac promoter were inserted into the bacterial chromosome via double homologous
recombination. The resultant strain MOCG134cPrn was shown to produce pyrrolnitrin
earlier than the wild-type strain.

Pyrrolnitrin production of the wild-type strain MOCG134, of strain MOCG134cPrn, and of a
strain containing plasmid borne Prn genes under the control of the tac promoter
(MOCG134pPrn) was assayed at various time points (14, 17, 20, 23 and 26 hours growth).
Cultures were inoculated with a 1/10,000 dilution of a stationary phase culture, pyrrolnitrin
was extracted with ethyl acetate, and the amount of pyrrolnitrin was determined by
integrating the peak area of pyrrolnitrin detected by HPLC at 212 nm. The results shown in
Table 3 clearly indicate that strains containing the Prn genes under the control of the tac
promoter produce pyrrolnitrin much earlier than the wild-type MOCG134 strain. The new
strains produce pyrrolnitrin independent of gafA and are useful as new biocontrol strains.



Table 3: Pyrrolnitrin production of different strains at different time points

time of growth (hours)

14          1250        7100        18300
17          3500        14600       26700
20          9600        16600       32100
23          17500       18900       31000
26          25000       22500       33500

Example 12: Construction of Pyrrolnitrin Gene Deletion Mutants

To further demonstrate the involvement of the 4 ORFs in pyrrolnitrin biosynthesis,
independent deletions were created in each ORF and transferred back into Pseudomonas
fluorescens strain MOCG134 by homologous recombination. The plasmids used to
generate deletions are depicted in Figure 4 and the positions of the deletions are shown in
Figure 6. Each ORF is identified within the sequence disclosed as SEQ ID NO:1.

ORF1 (SEQ ID NO:2):

The plasmid pPRN1.77E was digested with MluI to liberate a 78 bp fragment internally from
ORF1. The remaining 4.66 kb vector-containing fragment was recovered, religated with T4
DNA ligase, and transformed into the E. coli host strain DH5α. This new plasmid was
linearized with MluI and the Klenow large fragment of DNA polymerase I was used to
create blunt ends (Maniatis et al., Molecular Cloning, Cold Spring Harbor Laboratory
(1982)). The neomycin phosphotransferase II (NPTII) gene cassette from pUC4K
(Pharmacia) was ligated into the plasmid by blunt end ligation and the new construct,
designated pBS(ORF1Δ), was transformed into DH5α. The construct contained a 78 bp
deletion of ORF1 at which position the NPTII gene conferring kanamycin resistance had
been inserted. The insert of this plasmid (i.e. ORF1 with NPTII insertion) was then excised
from the pBluescript II KS vector with EcoRI, ligated into the EcoRI site of the vector
pBR322 and transformed into the E. coli host strain HB101. The new plasmid was verified
by restriction enzyme digestion and designated pBR322(ORF1Δ).

ORF2 (SEQ ID NO:3):

The plasmids pPRN1.24E and pPRN1.01E containing contiguous EcoRI fragments
spanning ORF2 were double digested with EcoRI and XhoI. The 1.09 kb fragment from
pPRN1.24E and the 0.69 kb fragment from pPRN1.01E were recovered and ligated
together into the EcoRI site of pBR322. The resulting plasmid was transformed into the
host strain DH5α and the construct was verified by restriction enzyme digestion and
electrophoresis. The plasmid was then linearized with XhoI, the NPTII gene cassette from
pUC4K was inserted, and the new construct, designated pBR(ORF2Δ), was transformed
into HB101. The construct was verified by restriction digestions and agarose gel
electrophoresis and contains NPTII within a 472 bp deletion of the ORF2 gene.

ORF3 (SEQ ID NO:4):

The plasmid pPRN2.56Sph was digested with PstI to liberate a 350 bp fragment. The
remaining 2.22 kb vector-containing fragment was recovered and the NPTII gene cassette
from pUC4K was ligated into the PstI site. This intermediate plasmid, designated
pUC(ORF3Δ), was transformed into DH5α and verified by restriction digestion and agarose
gel electrophoresis. The gene deletion construct was excised from pUC with SphI and
ligated into the SphI site of pBR322. The new plasmid, designated pBR(ORF3Δ), was
verified by restriction enzyme digestion and agarose gel electrophoresis. This plasmid
contains the NPTII gene within a 350 bp deletion of the ORF3 gene.

ORF4 (SEQ ID NO:5):

The plasmid pPRN2.18E/N was digested with AatII to liberate a 156 bp fragment. The
remaining 2.0 kb vector-containing fragment was recovered, religated, transformed into
DH5α, and verified by restriction enzyme digestion and electrophoresis. The new plasmid
was linearized with AatII and T4 DNA polymerase was used to create blunt ends. The
NPTII gene cassette was ligated into the plasmid by blunt-end ligation and the new
construct, designated pBS(ORF4Δ), was transformed into DH5α. The insert was excised
from the pBluescript II KS vector with EcoRI, ligated into the EcoRI site of the vector
pBR322 and transformed into the E. coli host strain HB101. The identity of the new
plasmid, designated pBR(ORF4Δ), was verified by restriction enzyme digestion and agarose
gel electrophoresis. This plasmid contains the NPTII gene within a 264 bp deletion of the
ORF4 gene.

KmR Control: 

To control for possible effects of the kanamycin resistance marker, the NPTII gene cassette
from pUC4K was inserted upstream of the pyrrolnitrin gene region. The plasmid pPRN2.5S
(a subclone of pPRN7.2E) was linearized with PstI and the NPTII cassette was ligated into
the PstI site. This intermediate plasmid was transformed into DH5α and verified by
restriction digestions and agarose gel electrophoresis. The gene insertion construct was
excised from pUC with SphI and ligated into the SphI site of pBR322. The new plasmid,
designated pBR(2.5SphIKmR), was verified by restriction enzyme digestion and agarose gel
electrophoresis. It contains the NPTII region inserted upstream of the pyrrolnitrin gene
region.

Each of the gene deletion constructs was mobilized into MOCG134 by triparental mating
using the helper plasmid pRK2013 in E. coli HB101. Gene replacement mutants were
selected by plating on Pseudomonas Minimal Medium (PMM) supplemented with 50 µg/ml
kanamycin and counterselected on PMM supplemented with 30 µg/ml tetracycline. Putative
perfect replacement mutants were verified by Southern hybridization by probing EcoRI
digested DNA with pPRN18Not, pBR322 and an NPTII cassette obtained from pUC4K
(Pharmacia 1994 catalog no. 27-4958-01). Verification of perfect hybridization was
apparent by lack of hybridization to pBR322, hybridization of pPRN18Not to an
appropriately size-shifted EcoRI fragment (reflecting deletion and insertion of NPTII),
hybridization of the NPTII probe to the shifted band, and the disappearance of a band
corresponding to a deleted fragment.
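
The four Southern hybridization criteria listed above can be read as a checklist that a
putative replacement mutant must satisfy in full. The following sketch encodes that
checklist; the class, field and function names are hypothetical, and the example
observations are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class SouthernResult:
    # Hypothetical summary of what was observed on EcoRI-digested DNA.
    pbr322_signal: bool                  # any hybridization to the pBR322 probe
    region_band_shifted: bool            # gene-region probe detects a size-shifted fragment
    nptii_on_shifted_band: bool          # NPTII probe hybridizes to that shifted band
    deleted_fragment_band_present: bool  # band for the deleted fragment still visible


def is_perfect_replacement(r: SouthernResult) -> bool:
    """All four criteria from the text must hold together."""
    return (
        not r.pbr322_signal
        and r.region_band_shifted
        and r.nptii_on_shifted_band
        and not r.deleted_fragment_band_present
    )


# Invented observations: a clean double crossover, and a mutant that still
# carries vector sequence (for example after a single crossover).
double_crossover = SouthernResult(False, True, True, False)
vector_retained = SouthernResult(True, True, True, True)

print(is_perfect_replacement(double_crossover))
print(is_perfect_replacement(vector_retained))
```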
After verification, deletion mutants were tested for production of pyrrolnitrin,
2-hexyl-5-propyl-resorcinol, cyanide, and chitinase. A deletion in any one of the ORFs
abolished pyrrolnitrin production, but did not affect production of the other substances. The
presence of the NPTII gene cassette in the KmR control had no effect on the production of
pyrrolnitrin, 2-hexyl-5-propyl-resorcinol, cyanide or chitinase. These experiments
demonstrated the requirement of each of the four ORFs for pyrrolnitrin production.

Example 12a: Cloning of the coding regions for expression in plants 
The coding regions of ORFs 1, 2, 3, and 4 were designated prnA, prnB, prnC and prnD,
respectively. Primers were designed to PCR amplify the coding regions for each prn gene
from the start codon to or beyond the stop codon as shown in Table 2. Additionally, the
primers were designed to add restriction sites to the ends of the coding regions and, in the
case of prnB, to change the initiation codon from GTG to ATG. Plasmid
pPRN18Not (Figure 4) was used as template for the PCR reactions. The PCR products
were cloned into pPEH14 for functional testing. Plasmid pPEH14 is a modification of
pRK(KK223-3) which contains a synthetic ribosome binding site 11 to 14 bases upstream of
the start codons of the cloned PCR products. The constructs were mobilized into the
respective ORF deletion mutants by triparental matings as described earlier. The presence
of each plasmid and the correct orientation of the inserted PCR product were confirmed by
plasmid DNA extraction, restriction digestion, and agarose gel electrophoresis. Pyrrolnitrin
production of the complemented mutants was confirmed as described in example 11.

After the expression of a functional protein by each coding region was verified (i.e., the
ability to restore pyrrolnitrin production to an ORF deletion mutant was demonstrated) the
clones were sequenced and compared to the sequence of the pyrrolnitrin gene cluster
(1506 CIP3). For prnA, prnB and prnC the sequences of the amplified coding regions were
identical to the original gene cluster sequences. For prnD there was a single base change
at nucleotide position 5605 from G in the original sequence to A in the amplified coding
region. This base change results in a change from glycine to serine in the deduced amino
acid sequence, but does not affect function of the gene product according to the
complementation tests described above.
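
A single G to A change can convert a glycine codon into a serine codon only at certain
codon positions. The sketch below enumerates the possibilities using the standard genetic
code. It does not identify the codon actually affected at position 5605, which is not given
here; the helper name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def g_to_a_variants(codon):
    """Yield every codon obtained by changing exactly one G to A."""
    for i, base in enumerate(codon):
        if base == "G":
            yield codon[:i] + "A" + codon[i + 1:]


# Glycine codons for which a single G->A change gives a serine codon.
gly_to_ser = [
    (codon, variant)
    for codon, aa in CODON_TABLE.items()
    if aa == "G"
    for variant in g_to_a_variants(codon)
    if CODON_TABLE[variant] == "S"
]
print(gly_to_ser)

# The primer-directed change of the prnB initiation codon from GTG to ATG:
print(CODON_TABLE["GTG"], "->", CODON_TABLE["ATG"])
```

Only a change at the first position of GGT or GGC gives serine (AGT or AGC). As an internal
codon GTG encodes valine, which is why the table prints V for it; the primers replace it with
the conventional ATG initiation codon.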
Table 2: Coding regions of the prn genes (a)

Coding      Start of    Start       Stop        End of
region      amplified   codon (b)   codon (c)   amplified
            segment                             segment

prnA        423         423         2039        2055
prnB        2039        2039        3076        3081
prnC        3166        3166        4869        4075
prnD        4894        4894        5985        5985

(a) All nucleotide position numbers refer to Sequence ID No. 1
(b) The first base of the start codon
(c) The last base of the stop codon



Example 12b: Expression of prn genes in plants 

The coding regions for each prn gene, described in example 12a above, were subcloned into a
plant expression cassette consisting of the CaMV 35S promoter and leader and the CaMV 35S
terminator flanked by XbaI restriction sites. Each construct comprising promoter, coding region,
and terminator was liberated with XbaI, subcloned into the binary transformation vector
pCIB200, and then transformed into Agrobacterium tumefaciens host strain A136. Tobacco
transformation was carried out as described by Horsch et al. (Science 227: 1229-1231, 1985).
Arabidopsis transformation was carried out as described by Lloyd et al. (Science 234: 464-466,
1986). Plantlets were selected and regenerated on medium containing 100 mg/L kanamycin and
500 mg/L carbenicillin.

Tobacco leaf tissue was harvested from individual plants that were suspected to be
transformed. Arabidopsis leaf tissue from about 10 independent plants suspected to be
transformed was pooled for each gene construct used for transformation. RNA was purified by
phenol:chloroform extraction and fractionated by formaldehyde gel electrophoresis before
blotting onto nylon membranes. Probes to each coding region were made using the random
primed labeling method. Hybridization was carried out in 50% formamide at 42°C as described
by Sambrook et al., Molecular Cloning, 2nd ed., Cold Spring Harbor Laboratory, 1989.

For each prn gene, transgenic tobacco plants were identified which produced RNA bands
hybridizing strongly to the appropriate prn gene probe and showing the size expected for an
mRNA transcribed from the relevant prn gene. Similar bands were also seen in RNA
extracted from the pooled samples of Arabidopsis tissue. The data demonstrate that
mRNAs encoding the enzymes of the pyrrolnitrin biosynthetic pathway accumulate in
transgenic plants.



D. Cloning of Resorcinol Biosynthetic Genes from Pseudomonas

2-hexyl-5-propyl-resorcinol is a further APS produced by certain strains of Pseudomonas. It
has been shown to have antipathogenic activity against Gram-positive bacteria (in particular
Clavibacter spp.), mycobacteria, and fungi.

Example 13: Isolation of Genes Encoding Resorcinol 

Two transposon-insertion mutants have been isolated which lack the ability to produce the
antipathogenic substance 2-hexyl-5-propyl-resorcinol, which is a further substance known to
be under the global regulation of the gafA gene in Pseudomonas fluorescens (WO
94/01561). The insertion transposon TnCIB116 was used to generate libraries of mutants
in MOCG134 and a gafA- derivative of MOCG134 (BL1826). The former was screened for
changes in fungal inhibition in vitro; the latter was screened for genes regulated by gafA
after introduction of gafA on a plasmid (see Section C). Selected mutants were
characterized by HPLC to assay for production of known compounds such as pyrrolnitrin
and 2-hexyl-5-propyl-resorcinol. The HPLC assay enabled a comparison of the novel
mutants to the wild-type parental strain. In each case, the HPLC peak corresponding to
2-hexyl-5-propyl-resorcinol was missing in the mutant. The mutant derived from MOCG134 is
designated BL1846. The mutant derived from BL1826 is designated BL1911. HPLC for
resorcinol follows the same procedure as for pyrrolnitrin (see example 11) except that 100%
methanol is applied to the column at 20 min to elute resorcinol.

The resorcinol biosynthetic genes can be cloned from the above-identified mutants in the
following manner. Genomic DNA is prepared from the mutants, and clones containing the
transposon insertion and adjacent Pseudomonas sequence are obtained by selecting for
kanamycin resistant clones (kanamycin resistance is encoded by the transposon). The
cloned Pseudomonas sequence is then used as a probe to identify the native sequences
from a genomic library of P. fluorescens MOCG134. The cloned native genes are likely to
represent resorcinol biosynthetic genes.

E. Cloning Soraphen Biosynthetic Genes from Sorangium

Soraphen is a polyketide antibiotic produced by the myxobacterium Sorangium cellulosum. 
This compound has broad antifungal activities which make it useful for agricultural 
applications. In particular, soraphen has activity against a broad range of foliar pathogens. 

Example 14: Isolation of the Soraphen Gene Cluster 

Genomic DNA was isolated from Sorangium cellulosum and partially digested with Sau3A.
Fragments of between 30 and 40 kb were size selected and cloned into the cosmid vector
pHC79 (Hohn & Collins, Gene 11: 291-298 (1980)) which had been previously digested with
BamHI and treated with alkaline phosphatase to prevent self ligation. The cosmid library
thus prepared was probed with a 4.6 kb fragment which contains the gral region of
Streptomyces violaceoruber strain Tu22 encoding ORFs 1-4 responsible for the
biosynthesis of granaticin in S. violaceoruber. Cosmid clones which hybridized to the gral
probe were identified and DNA was prepared for analysis by restriction digestion and further
hybridization. Cosmid p98/1 was identified to contain a 1.8 kb SalI fragment which
hybridized strongly to the gral region; this SalI fragment was located within a larger 6.5 kb
PvuI fragment within the ~40 kb insert of p98/1. Determination of the sequence of part of
the 1.8 kb SalI insert revealed homology to the acetyltransferase proteins required for the
synthesis of erythromycin. Restriction mapping of the cosmid p98/1 was undertaken and
generated the map depicted in figure 7. A viable culture of E. coli HB101 comprising cosmid
clone 98/1 has been deposited with the Agricultural Research Culture Collection (NRRL) at
1815 N. University Street, Peoria, Illinois 61604 U.S.A. on May 20, 1994, under the
accession number NRRL B-21255. The DNA sequence of the soraphen gene cluster is
disclosed in SEQ ID NO:6.

Example 15: Functional Analysis of the Soraphen Gene Cluster 

The regions within p98/1 that encode proteins with a role in the biosynthesis of soraphen
were identified through gene disruption experiments. Initially, DNA fragments were derived
from cosmid p98/1 by restriction with PvuI and cloned into the unique PvuI cloning site
(which is within the gene for ampicillin resistance) of the wide host-range plasmid
pSUP2021 (Simon et al. in: Molecular Genetics of the Bacteria-Plant Interaction (ed.: A.
Puhler), Springer Verlag, Berlin pp 98-106 (1983)). Transformed E. coli HB101 was
selected for resistance to chloramphenicol, but sensitivity to ampicillin. Selected colonies
carrying appropriate inserts were transferred to Sorangium cellulosum SJ3 by conjugation
using the method described in the published application EP 0 501 921 (to Ciba-Geigy).
Plasmids were transferred to E. coli ED8767 carrying the helper plasmid pUZ8 (Hedges &
Mathew, Plasmid 2: 269-278 (1979)) and the donor cells were incubated with Sorangium
cellulosum SJ3 cells from a stationary phase culture for conjugative transfer essentially as
described in EP 0 501 921 (example 5) and EP the later app, (example 2). Selection was
on kanamycin, phleomycin and streptomycin. It has been determined that no plasmids
tested thus far are capable of autonomous replication in Sorangium cellulosum, but rather,
integration of the entire plasmid into the chromosome by homologous recombination occurs
at a site within the cloned fragment at low frequency. These events can be selected for by
the presence of antibiotic resistance markers on the plasmid. Integration of the plasmid at a
given site results in the insertion of the plasmid into the chromosome and the concomitant
disruption of this region by this event. Therefore, a given phenotype of interest,
i.e. soraphen production, can be assessed, and disruption of the phenotype will indicate that
the DNA region cloned into the plasmid must have a role in the determination of this
phenotype.

Recombinant pSUP2021 clones with PvuI inserts of approximate size 6.5 kb (pSN105/7),
10 kb (pSN120/10), 3.8 kb (pSN120/43-39) and 4.0 kb (pSN120/46) were selected. The
map locations (in kb) of these PvuI inserts as shown in Figure 7 are: pSN105/7 - 25.0-31.7,
pSN120/10 - 2.5-14.5, pSN120/43-39 - 16.1-20.0, and pSN120/46 - 20.0-24.0. pSN105/7
was shown by digestion with PvuI and SalI to contain the 1.8 kb fragment referred to above
in example 11. Gene disruptions with the 3.8, 4.0, 6.5, and 10 kb PvuI fragments all
resulted in the elimination of soraphen production. These results indicate that all of these
fragments contain genes or fragments of genes with a role in the production of this
compound.
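
The mapping logic of these disruption results can be illustrated with a short sketch. It
takes the four map intervals quoted above, all of which eliminated soraphen production,
and merges touching or overlapping intervals into contiguous regions of Figure 7 that are
implicated in biosynthesis. The names DISRUPTING_INSERTS and merge_intervals are
illustrative assumptions and are not part of the original disclosure.

```python
# Map locations (kb, Figure 7 coordinates) of the PvuI inserts quoted in the text.
# Every one of these inserts eliminated soraphen production on gene disruption.
DISRUPTING_INSERTS = {
    "pSN105/7": (25.0, 31.7),
    "pSN120/10": (2.5, 14.5),
    "pSN120/43-39": (16.1, 20.0),
    "pSN120/46": (20.0, 24.0),
}


def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals into contiguous regions."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # This interval touches or overlaps the previous region: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


implicated_regions = merge_intervals(DISRUPTING_INSERTS.values())
for start, end in implicated_regions:
    print(f"{start:.1f}-{end:.1f} kb: required for soraphen production")
```

Intervals that were not tested by disruption simply do not appear in the result; their
absence says nothing about whether they are required.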

Subsequently gene disruption experiments were performed with two BglII fragments derived
from cosmid p98/1. These were of size 3.2 kb (map location 32.4-35.6 on Figure 7) and 2.9
kb (map location 35.6-38.5 on Figure 7). These fragments were cloned into the BamHI site
of plasmid pCIB132 that was derived from pSUP2021 according to Figure 8. The ~5 kb
NotI fragment of pSUP2021 was excised and inverted, followed by the removal of the ~3 kb
BamHI fragment. Neither of the two BglII fragments was able to disrupt soraphen
biosynthesis when reintroduced into Sorangium using the method described above. This
indicates that the DNA of these fragments has no role in soraphen biosynthesis.
Examination of the DNA sequence indicates the presence of a thioesterase domain 5' to,
but near the BglII site at location 32.4. In addition, there are transcription stop codons
immediately after the thioesterase domain which are likely to demarcate the end of the
ORF1 coding region. As the 2.9 and 3.2 kb BglII fragments are immediately to the right of
these sequences it is likely that there are no other genes downstream from ORF1 that are
involved in soraphen biosynthesis.

Delineation of the left end of the biosynthetic region required the isolation of two other
cosmid clones, pJL1 and pJL3, that overlap p98/1 on the left end, but include more DNA
leftwards of p98/1. These were isolated by hybridization of the 1.3 kb BamHI fragment on
the extreme left end of p98/1 (map location 0.0-1.3) to the Sorangium cellulosum gene
library. It should be noted that the BamHI site at 0.0 does not exist in the S. cellulosum
chromosome but was formed as an artifact from the ligation of a Sau3A restriction fragment
derived from the Sorangium cellulosum genome into the BamHI cloning site of pHC79.
Southern hybridization with the 1.3 kb BamHI fragment demonstrated that pJL1 and pJL3
each contain an approximately 12.5 kb BamHI fragment that contains sequences common
to the 1.3 kb fragment as this fragment is in fact delineated by the BamHI site at position
1.3. A viable culture of E. coli HB101 comprising cosmid clone pJL3 has been deposited with
the Agricultural Research Culture Collection (NRRL) at 1815 N. University Street, Peoria,
Illinois 61604 U.S.A. on May 20, 1994, under the accession number NRRL B-21254. Gene
disruption experiments using the 12.5 kb BamHI fragment indicated that this fragment
contains sequences that are involved in the synthesis of soraphen. Gene disruption using
smaller EcoRV fragments derived from this region indicated the requirement of this region
for soraphen biosynthesis. For example, two EcoRV fragments of 3.4 and 1.1 kb located
adjacent to the distal BamHI site at the left end of the 12.5 kb fragment resulted in a
reduction in soraphen biosynthesis when used in gene disruption experiments.

Example 16: Sequence Analysis of the Soraphen Gene Cluster
The DNA sequence of the soraphen gene cluster was determined from the PvuI site at
position 2.5 to the BglII site at position 32.4 (see Figure 7) using the Taq DyeDeoxy
Terminator Cycle Sequencing Kit supplied by Applied Biosystems, Inc., Foster City, CA,
following the protocol supplied by the manufacturer. Sequencing reactions were run on an
Applied Biosystems 373A Automated DNA Sequencer and the raw DNA sequence was
assembled and edited using the "INHERIT" software package also from Applied
Biosystems, Inc. The pattern recognition program "FRAMES" was used to search for open
reading frames (ORFs) in all six translation frames of the DNA sequence. In total
approximately 30 kb of contiguous DNA was assembled and this corresponds to the region
determined to be critical to soraphen biosynthesis in the disruption experiments described in
example 12. This sequence encodes two ORFs which have the structure described below.

ORF1: 

ORF1 is approximately 25.5 kb in size and encodes five biosynthetic modules with
homology to the modules found in the erythromycin biosynthetic genes of
Saccharopolyspora erythraea (Donadio et al., Science 252: 675-679 (1991)). Each module
contains a β-ketoacylsynthase (KS), an acyltransferase (AT), a ketoreductase (KR) and an
acyl carrier protein (ACP) domain as well as β-ketone processing domains which may
include a dehydratase (DH) and/or enoyl reductase (ER) domain. In the biosynthesis of the
polyketide structure each module directs the incorporation of a new two-carbon extender
unit and the correct processing of the β-ketone carbon.

ORF2:

In addition to ORF1, DNA sequence data from the p98/1 fragment spanning the PvuI site at
2.5 kb and the SmaI site at 6.2 kb indicated the presence of a further ORF (ORF2)
immediately adjacent to ORF1. The DNA sequence demonstrates the presence of a typical
biosynthetic module that appears to be encoded on an ORF whose 5' end is not yet
sequenced and is some distance to the left. By comparison to other polyketide biosynthetic
gene units and the number of carbon atoms in the soraphen ring structure it is likely that
there should be a total of eight modules in order to direct the synthesis of the 17-carbon
molecule soraphen. Since there are five modules in ORF1 described above, it was
predicted that ORF2 contains a further three and that these would extend beyond the left
end of cosmid p98/1 (position 0 in Figure 7). This is entirely consistent with the gene
description of example 12. The cosmid clones pJL1 and pJL3 extending beyond the left
end of p98/1 presumably carry the sequence encoding the remaining modules required for
soraphen biosynthesis.
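
The six-frame ORF search mentioned in this example can be sketched in a few lines. The
code below is not the "FRAMES" program, whose algorithm is not described here; it is a
minimal illustration that assumes ATG as the only start codon and the standard stop codons
TAA, TAG and TGA. A GC-rich organism may also use alternative start codons such as GTG,
which this sketch ignores. All function names are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame, min_codons):
    """Yield (start, end) offsets of ATG...stop ORFs in one reading frame.

    `end` is exclusive and includes the stop codon. `min_codons` counts the
    codons from the ATG up to, but not including, the stop codon.
    """
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            if (i - start) // 3 >= min_codons:
                yield start, i + 3
            start = None


def six_frame_orfs(seq, min_codons=2):
    """Search both strands in all three frames; return (strand, frame, start, end)."""
    seq = seq.upper()
    found = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(strand_seq, frame, min_codons):
                found.append((strand, frame, start, end))
    return found


# Toy sequence (hypothetical, not soraphen DNA): one ORF on the forward strand.
print(six_frame_orfs("CCATGAAATTTTAGGG"))
```

Offsets reported for the "-" strand are relative to the reverse-complemented sequence, a
design choice that keeps the sketch short; a production tool would map them back to
forward-strand coordinates.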

Example 17: Soraphen: Requirement for Methylation

Synthesis of polyketides typically requires, as a first step, the condensation of a starter unit
(commonly acetate) and an extender unit (malonate) with the loss of one carbon atom in the
form of CO2 to yield a three-carbon chain. All subsequent additions result in the addition of
two-carbon units to the polyketide ring (Donadio et al., Science 252: 675-679 (1991)). Since
soraphen has a 17-carbon ring, it is likely that there are 8 biosynthetic modules required
for its synthesis. Five modules are encoded in ORF1 and a sixth is present at the 3' end of
ORF2. As explained above, it is likely that the remaining two modules are also encoded by
ORF2 in the regions that are in the 15 kb BamHI fragment from pJL1 and pJL3 for which
the sequence has not yet been determined.
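
The module count above follows from simple arithmetic, which the sketch below makes
explicit. It uses the carbon accounting exactly as stated in this example (a three-carbon
chain from the first condensation, two carbons for each later addition) and is not an
independent biochemical model. The function and constant names are illustrative.

```python
def modules_required(ring_carbons, first_step_carbons=3, carbons_per_extension=2):
    """Number of modules implied by the carbon accounting stated in the text.

    The first module accounts for `first_step_carbons`; every further module
    adds `carbons_per_extension` carbons to the chain.
    """
    remaining = ring_carbons - first_step_carbons
    if remaining < 0 or remaining % carbons_per_extension != 0:
        raise ValueError("ring size does not fit the stated carbon accounting")
    return 1 + remaining // carbons_per_extension


SORAPHEN_RING_CARBONS = 17   # from the text
MODULES_IN_ORF1 = 5          # from the text

total_modules = modules_required(SORAPHEN_RING_CARBONS)
modules_expected_in_orf2 = total_modules - MODULES_IN_ORF1
print(total_modules, modules_expected_in_orf2)
```

With 17 ring carbons this gives 1 + (17 - 3) / 2 = 8 modules, of which three are expected
outside ORF1, in agreement with the prediction for ORF2.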

The polyketide modular biosynthetic apparatus present in Sorangium cellulosum is required
for the production of the compound, soraphen C, which has no antipathogenic activity. The
structure of this compound is the same as that of the antipathogenic soraphen A with the
exception that the O-methyl groups of soraphen A at positions 6, 7, and 14 of the ring are
hydroxyl groups. These are methylated by a specific methyltransferase to form the active
compound soraphen A. A similar situation exists in the biosynthesis of erythromycin in
Saccharopolyspora erythraea. The final step in the biosynthesis of this molecule is the
methylation of three hydroxyl groups by a methyltransferase (Haydock et al., Mol. Gen.
Genet. 230: 120-128 (1991)). It is highly likely, therefore, that a similar methyltransferase
(or possibly more than one) operates in the biosynthesis of soraphen A (soraphen C is
unmethylated and soraphen B is partially methylated). In all polyketide biosynthesis
systems examined thus far, all of the biosynthetic genes and associated methylases are
clustered together (Summers et al., J. Bacteriol. 174: 1810-1820 (1992)). It is also probable,
therefore, that a similar situation exists in the soraphen operon and that the gene encoding
the methyltransferase/s required for the conversion of soraphen B and C to soraphen A is
located near the ORF1 and ORF2 that encode the polyketide synthase. The results of the
gene disruption experiments described above indicate that this gene is not located
immediately downstream from the 3' end of ORF1 and that it is likely located upstream of
ORF2 in the DNA contained in pJL1 and pJL3. Thus, using standard techniques in the art,
the methyltransferase gene can be cloned and sequenced.

Soraphen Determination

Sorangium cellulosum cells were cultured in a liquid growth medium containing an
exchange resin, XAD-5 (Rohm and Haas) (5% w/v). The soraphen A produced by the cells
bound to the resin, which was collected by filtration through a polyester filter (Sartorius B
420-47-N), and the soraphen was released from the resin by extraction with 50 ml
isopropanol for 1 hr at 30°C. The isopropanol containing soraphen A was collected and
concentrated by drying to a volume of approximately 1 ml. Aliquots of this sample were
analyzed by HPLC at 210 nm to detect and quantify the soraphen A. This assay procedure
is specific for soraphen A (fully methylated); partially and non-methylated soraphen forms
have a different retention time (Rt) and are not measured by this procedure. This procedure
was used to assay soraphen A production after gene disruption.

F. Cloning and Characterization of Phenazine Biosynthetic Genes from Pseudomonas aureofaciens

The phenazine antibiotics are produced by a variety of Pseudomonas and Streptomyces
species as secondary metabolites branching off the shikimic acid pathway. It has been
postulated that two chorismic acid molecules are condensed along with two nitrogens
derived from glutamine to form the three-ringed phenazine pathway precursor
phenazine-1,6-dicarboxylate. However, there is also genetic evidence that anthranilate is an
intermediate between chorismate and phenazine-1,6-dicarboxylate (Essar et al., J.
Bacteriol. 172: 853-866 (1990)). In Pseudomonas aureofaciens 30-84, production of three
phenazine antibiotics, phenazine-1-carboxylic acid, 2-hydroxyphenazine-1-carboxylic acid,
and 2-hydroxyphenazine, is the major mode of action by which the strain protects wheat
from the fungal phytopathogen Gaeumannomyces graminis var. tritici (Pierson &
Thomashow, MPMI 5: 330-339 (1992)). Likewise, in Pseudomonas fluorescens 2-79,
phenazine production is a major factor in the control of G. graminis var. tritici (Thomashow &
Weller, J. Bacteriol. 170: 3499-3508 (1988)).

Example 18: Isolation of the Phenazine Biosynthetic Genes

Pierson & Thomashow (supra) have previously described the cloning of a cosmid which
confers a phenazine biosynthesis phenotype on transposon insertion mutants of
Pseudomonas aureofaciens strain 30-84 which were disrupted in their ability to synthesize
phenazine antibiotics. A mutant library of strain 30-84 was made by conjugation with E. coli
S17-1(pSUP1021) and mutants unable to produce phenazine antibiotics were selected.
Selected mutants were unable to produce phenazine carboxylic acid, 2-hydroxyphenazine
or 2-hydroxy-phenazine carboxylic acid. These mutants were transformed by a cosmid
genomic library of strain 30-84 leading to the isolation of cosmid pLSP259 which had the
ability to complement phenazine mutants by the synthesis of phenazine carboxylic acid,
2-hydroxyphenazine and 2-hydroxy-phenazinecarboxylic acid. pLSP259 was further
characterized by transposon mutagenesis using the λ::Tn5 phage described by de Bruijn &
Lupski (Gene 27: 131-149 (1984)). Thus a segment of approximately 2.8 kb of DNA was
identified as being responsible for the phenazine complementing phenotype; this 2.8 kb
segment is located within a larger 9.2 kb EcoRI fragment of pLSP259. Transfer of the 9.2
kb EcoRI fragment and various deletion derivatives thereof to E. coli under the control of
the lacZ promoter was undertaken to assay for the production in E. coli of phenazine. The
shortest deletion derivative which was found to confer biosynthesis of all three phenazine
compounds to E. coli contained an insert of approximately 6 kb and was designated
pLSP18-6H3del3. This plasmid contained the 2.8 kb segment previously identified as being
critical to phenazine biosynthesis in the host 30-84 strain and was provided by Dr LS
Pierson (Department of Plant Pathology, U Arizona, Tucson, AZ) for sequence
characterization. Other deletion derivatives were able to confer production of phenazine
carboxylic acid on E. coli, without the accompanying production of 2-hydroxyphenazine and
2-hydroxyphenazinecarboxylic acid, suggesting that at least two genes might be involved in
the synthesis of phenazine and its hydroxy derivatives.

The DNA sequence comprising the genes for the biosynthesis of phenazine is disclosed in
SEQ ID NO:17. Plasmid pCIB3350 contains the PstI-HindIII fragment of the phenazine gene
cluster and has been deposited with the Agricultural Research Culture Collection (NRRL) at
1815 N. University Street, Peoria, Illinois 61604 U.S.A. on May 20, 1994, under the
accession number NRRL B-21257. Plasmid pCIB3351 contains the EcoRI-PstI fragment of
the phenazine gene cluster and has been deposited with the Agricultural Research Culture
Collection (NRRL) at 1815 N. University Street, Peoria, Illinois 61604 U.S.A. on May 20,
1994, under the accession number NRRL B-21258. pCIB3350 along with pCIB3351
comprises the entire phenazine gene of SEQ ID NO:17. Determination of the DNA
sequence of the insert of pLSP18-6H3del3 revealed the presence of four ORFs within and
adjacent to the critical 2.8 kb segment. ORF1 (SEQ ID NO:18) was designated phz1, ORF2
(SEQ ID NO:19) was designated phz2, ORF3 (SEQ ID NO:20) was designated phz3,
and ORF4 (SEQ ID NO:22) was designated phz4. The DNA sequence of phz4 is shown in
SEQ ID NO:21. phz1 is approximately 1.35 kb in size and has homology at the 5' end to the
entB gene of E. coli, which encodes isochorismatase. phz2 is approximately 1.15 kb in size
and has some homology at the 3' end to the trpG gene which encodes the beta subunit of
anthranilate synthase. phz3 is approximately 0.85 kb in size. phz4 is approximately 0.65 kb
in size and is homologous to the pdxH gene of E. coli which encodes pyridoxamine
5'-phosphate oxidase.

Phenazine Determination

Thomashow et al. (Appl Environ Microbiol 56: 908-912 (1990)) describe a method for the
isolation of phenazine. This involves acidifying cultures to pH 2.0 with HCl and extraction
with benzene. Benzene fractions are dehydrated with Na2SO4 and evaporated to dryness.
The residue is redissolved in aqueous 5% NaHCO3, reextracted with an equal volume of
benzene, acidified, partitioned into benzene and redried. Phenazine concentrations are
determined after fractionation by reverse-phase HPLC as described by Thomashow et al.
(supra).

G. Cloning Peptide Antipathogenic Genes

This group of substances is diverse and is classifiable into two groups: (1) those which are 
synthesized by enzyme systems without the participation of the ribosomal apparatus, and 
(2) those which require the ribosomally-mediated translation of an mRNA to provide the 
precursor of the antibiotic. 

Non-Ribosomal Peptide Antibiotics. 

Non-ribosomal peptide antibiotics are assembled by large, multifunctional enzymes which
activate, modify, polymerize and in some cases cyclize the subunit amino acids, forming
polypeptide chains. Other acids, such as aminoadipic acid, diaminobutyric acid,
diaminopropionic acid, dihydroxyamino acid, isoserine, dihydroxybenzoic acid,
hydroxyisovaleric acid, (4R)-4-[(E)-2-butenyl]-4,N-dimethyl-L-threonine, and ornithine are
also incorporated (Katz & Demain, Bacteriological Reviews 41: 449-474 (1977); Kleinkauf &
von Dohren, Annual Review of Microbiology 41: 259-289 (1987)). The products are not
encoded by any mRNA, and ribosomes do not directly participate in their synthesis.
Peptide antibiotics synthesized non-ribosomally can in turn be grouped according to their
general structures into linear, cyclic, lactone, branched cyclopeptide, and depsipeptide
categories (Kleinkauf & von Dohren, European Journal of Biochemistry 192: 1-15 (1990)).
These different groups of antibiotics are produced by the action of modifying and cyclizing
enzymes; the basic scheme of polymerization is common to them all. Non-ribosomally
synthesized peptide antibiotics are produced by both bacteria and fungi, and include
edeine, linear gramicidin, tyrocidine and gramicidin S from Bacillus brevis, mycobacillin from
Bacillus subtilis, polymyxin from Bacillus polymyxa, etamycin from Streptomyces griseus,
echinomycin from Streptomyces echinatus, actinomycin from Streptomyces clavuligerus,
enterochelin from Escherichia coli, gamma-(alpha-L-aminoadipyl)-L-cysteinyl-D-valine (ACV)
from Aspergillus nidulans, alamethicine from Trichoderma viride, destruxin from Metarhizium
anisopliae, enniatin from Fusarium oxysporum, and beauvericin from Beauveria bassiana.
Extensive functional and structural similarity exists between the prokaryotic and eukaryotic
systems, suggesting a common origin for both. The activities of peptide antibiotics are
similarly broad; toxic effects of different peptide antibiotics in animals, plants, bacteria, and
fungi are known (Hansen, Annual Review of Microbiology 47: 535-564 (1993); Katz &
Demain, Bacteriological Reviews 41: 449-474 (1977); Kleinkauf & von Dohren, Annual
Review of Microbiology 41: 259-289 (1987); Kleinkauf & von Dohren, European Journal of
Biochemistry 192: 1-15 (1990); Kolter & Moreno, Annual Review of Microbiology 46: 141-
163 (1992)).

Amino acids are activated by the hydrolysis of ATP to form an adenylated amino or hydroxy
acid, analogous to the charging reactions carried out by aminoacyl-tRNA synthetases, and
then covalent thioester intermediates are formed between the amino acids and the
enzyme(s), either at specific cysteine residues or to a thiol donated by pantetheine. The
amino acid-dependent hydrolysis of ATP is often used as an assay for peptide antibiotic
enzyme complexes (Ishihara, et al., Journal of Bacteriology 171: 1705-1711 (1989)). Once
bound to the enzyme, activated amino acids may be modified before they are incorporated
into the polypeptide. The most common modifications are epimerization of L-amino
(hydroxy) acids to the D- form, N-acylations, cyclizations and N-methylations.
Polymerization occurs through the participation of a pantetheine cofactor, which allows the
activated subunits to be sequentially added to the polypeptide chain. The mechanism by
which the peptide is released from the enzyme complex is important in the determination of
the structural class in which the product belongs. Hydrolysis or aminolysis by a free amine
of the thiolester will yield a linear (unmodified or terminally aminated) peptide such as
edeine; aminolysis of the thiolester by amine groups on the peptide itself will give either
cyclic (attack by terminal amine), such as gramicidin S, or branched (attack by side chain
amine), such as bacitracin, peptides; lactonization with a terminal or side chain hydroxy will
give a lactone, such as destruxin, branched lactone, or cyclodepsipeptide, such as
beauvericin.

The enzymes which carry out these reactions are large multifunctional proteins, having
molecular weights in accord with the variety of functions they perform. For example,
gramicidin synthetases 1 and 2 are 120 and 280 kDa, respectively; ACV synthetase is 230
kDa; enniatin synthetase is 250 kDa; bacitracin synthetases 1, 2, 3 are 335, 240, and 380
kDa, respectively (Katz & Demain, Bacteriological Reviews 41: 449-474 (1977); Kleinkauf &
von Dohren, Annual Review of Microbiology 41: 259-289 (1987); Kleinkauf & von Dohren,
European Journal of Biochemistry 192: 1-15 (1990)). The size and complexity of these
proteins means that relatively few genes must be cloned in order for the capability for the
complete nonribosomal synthesis of peptide antibiotics to be transferred. Further, the
functional and structural homology between bacterial and eukaryotic synthetic systems
indicates that such genes from any source of a peptide antibiotic can be cloned using the
available sequence information, current functional information, and conventional
microbiological techniques. The production of a fungicidal, insecticidal, or bactericidal
peptide antibiotic in a plant is expected to produce an advantage with respect to the
resistance to agricultural pests.

Example 19: Cloning of Gramicidin S Biosynthesis Genes

Gramicidin S is a cyclic antibiotic peptide and has been shown to inhibit the germination of
fungal spores (Murray, et al., Letters in Applied Microbiology 3: 5-7 (1986)), and may
therefore be useful in the protection of plants against fungal diseases. The gramicidin S
biosynthesis operon (grs) from Bacillus brevis ATCC 9999 has been cloned and sequenced,
including the entire coding sequences for gramicidin synthetase 1 (GS1, grsA), another
gene in the operon of unknown function (grsT), and GS2 (grsB) (Kratzschmar, et al.,
Journal of Bacteriology 171: 5422-5429 (1989); Krause, et al., Journal of Bacteriology 162:
1120-1125 (1985)). By methods well known in the art, pairs of PCR primers are designed
from the published DNA sequence which are suitable for amplifying segments of
approximately 500 base pairs from the grs operon using isolated Bacillus brevis ATCC 9999
DNA as a template. The fragments to be amplified are (1) at the 3' end of the coding region
of grsB, spanning the termination codon, (2) at the 5' end of the grsB coding sequence,
including the initiation codon, (3) at the 3' end of the coding sequence of grsA, including the
termination codon, (4) at the 5' end of the coding sequence of grsA, including the initiation
codon, (5) at the 3' end of the coding sequence of grsT, including the termination codon,
and (6) at the 5' end of the coding sequence of grsT, including the initiation codon. The
amplified fragments are radioactively or nonradioactively labeled by methods known in the
art and used to screen a genomic library of Bacillus brevis ATCC 9999 DNA constructed in
a vector such as λEMBL3. The 6 amplified fragments are used in pairs to isolate cloned
fragments of genomic DNA which contain intact coding sequences for the three biosynthetic
genes. Clones which hybridize to probes 1 and 2 will contain an intact grsB sequence,
those which hybridize to probes 3 and 4 will contain an intact grsA gene, those which
hybridize to probes 5 and 6 will contain an intact grsT gene. The cloned grsA is introduced
into E. coli and extracts prepared by lysing transformed bacteria through methods known in
the art are tested for activity by the determination of phenylalanine-dependent ATP-PPi
exchange (Krause, et al., Journal of Bacteriology 162: 1120-1125 (1985)) after removal of
proteins smaller than 120 kDa by gel filtration chromatography. GrsB is tested similarly by
assaying gel-filtered extracts from transformed bacteria for proline-, valine-, ornithine- and
leucine-dependent ATP-PPi exchange.
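
The paired-probe screening rule of this example (a clone is taken to carry an intact gene
only if it hybridizes to both the 3'-end and the 5'-end probe of that gene) can be written
down as a short sketch. The probe numbering follows the text; the names PROBE_PAIRS and
intact_genes are illustrative, and hybridization results are assumed to be supplied as a set
of probe numbers.

```python
# Probe numbering from the text: (3'-end probe, 5'-end probe) for each gene.
PROBE_PAIRS = {
    "grsB": (1, 2),
    "grsA": (3, 4),
    "grsT": (5, 6),
}


def intact_genes(hybridizing_probes):
    """Return the genes for which a clone hybridizes to BOTH end probes."""
    hits = set(hybridizing_probes)
    return sorted(
        gene for gene, pair in PROBE_PAIRS.items() if hits.issuperset(pair)
    )


# Hypothetical clones, for illustration only.
print(intact_genes({1, 2, 3}))      # both grsB probes, only one grsA probe
print(intact_genes({3, 4, 5, 6}))   # both grsA and both grsT probes
```

A clone that hybridizes to only one probe of a pair carries a truncated copy of that gene
and is excluded by the rule.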

Example 20: Cloning of Penicillin Biosynthesis Genes 

A 38 kb fragment of genomic DNA from Penicillium chrysogenum transfers the ability to
synthesize penicillin to the fungi Aspergillus niger and Neurospora crassa, which do not
normally produce it (Smith, et al., Bio/Technology 8: 39-41 (1990)). The genes which are
responsible for biosynthesis, delta-(L-alpha-aminoadipyl)-L-cysteinyl-D-valine synthetase,
isopenicillin N synthetase, and isopenicillin N acyltransferase have been individually cloned
from P. chrysogenum and Aspergillus nidulans, and their sequences determined (Ramon, et
al., Gene 57: 171-181 (1987); Smith, et al., EMBO Journal 9: 2743-2750 (1990); Tobin, et
al., Journal of Bacteriology 172: 5908-5914 (1990)). The cloning of these genes is
accomplished by following the PCR-based approach described above to obtain probes of
approximately 500 base pairs from genomic DNA from either Penicillium chrysogenum (for
example, strain AS-P-78, from Antibioticos, S.A., Leon, Spain), or from Aspergillus nidulans
(for example, strain G69). Their integrity and function may be checked by transforming the
non-producing fungi listed above and assaying for antibiotic production and individual
enzyme activities as described (Smith, et al., Bio/Technology 8: 39-41 (1990)).

Example 21: Cloning of Bacitracin A Biosynthesis Genes

Bacitracin A is a branched cyclopeptide antibiotic which has potential for the enhancement of
disease resistance to bacterial plant pathogens. It is produced by Bacillus licheniformis ATCC
10716, and three multifunctional enzymes, bacitracin synthetases (BA) 1, 2, and 3, are
required for its synthesis. The molecular weights of BA1, BA2, and BA3 are 335 kDa, 240
kDa, and 380 kDa, respectively. A 32 kb fragment of Bacillus licheniformis DNA which
encodes the BA2 protein and part of the BA3 protein shows that at least these two genes are
linked (Ishihara, et al., Journal of Bacteriology 171: 1705-1711 (1989)). Evidence from
gramicidin S, penicillin, and surfactin biosynthetic operons suggests that the first protein in the
pathway, BA1, will be encoded by a gene which is relatively close to BA2 and BA3. BA3 is
purified by published methods, and it is used to raise an antibody in rabbits (Ishihara, et al.,
supra). A genomic library of Bacillus licheniformis DNA is transformed into E. coli and clones
which express antigenic determinants related to BA3 are detected by methods known in the
art. Because BA1, BA2, and BA3 are antigenically related, the detection method will provide
clones encoding each of the three enzymes. The identity of each clone is confirmed by
testing extracts of transformed E. coli for the appropriate amino acid-dependent ATP-PPi
exchange. Clones encoding BA1 will exhibit leucine-, glutamic acid-, and isoleucine-
dependent ATP-PPi exchange, those encoding BA2 will exhibit lysine- and ornithine-
dependent exchange, and those encoding BA3 will exhibit isoleucine-, phenylalanine-,
histidine-, aspartic acid-, and asparagine-dependent exchange. If one or two genes are
obtained by this method, the others are isolated by techniques known in the art as "walking"
or "chromosome walking" techniques (Sambrook et al., in: Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor Laboratory Press, 1989).
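
The clone-identification step of this example amounts to matching an observed amino acid
dependence profile against the three expected profiles. A minimal sketch follows; the
expected profiles are copied from the text, while the exact-match rule and the names
EXPECTED_PROFILES and identify_synthetase are assumptions made for illustration. Real
assay data would be quantitative and would need a threshold before this kind of lookup.

```python
# Amino acids expected to support ATP-PPi exchange for each enzyme (from the text).
EXPECTED_PROFILES = {
    "BA1": frozenset({"leucine", "glutamic acid", "isoleucine"}),
    "BA2": frozenset({"lysine", "ornithine"}),
    "BA3": frozenset(
        {"isoleucine", "phenylalanine", "histidine", "aspartic acid", "asparagine"}
    ),
}


def identify_synthetase(observed_amino_acids):
    """Return the enzyme whose expected profile exactly matches, else None."""
    observed = frozenset(observed_amino_acids)
    for name, profile in EXPECTED_PROFILES.items():
        if observed == profile:
            return name
    return None


# Hypothetical assay results for two clones.
print(identify_synthetase(["ornithine", "lysine"]))
print(identify_synthetase(["isoleucine"]))
```

Isoleucine alone is ambiguous between BA1 and BA3, which is why the sketch requires the
full profile before naming an enzyme.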

Example 22: Cloning of Beauvericin and Destruxin Biosynthesis Genes

Beauvericin is an insecticidal hexadepsipeptide produced by the fungus Beauveria
bassiana (Kleinkauf & von Dohren, European Journal of Biochemistry 192: 1-15 (1990))
which will provide protection to plants from insect pests. It is an analog of enniatin, a
phytotoxic hexadepsipeptide produced by some phytopathogenic species of Fusarium
(Burmeister & Plattner, Phytopathology 77: 1483-1487 (1987)). Destruxin is an insecticidal
lactone peptide produced by the fungus Metarhizium anisopliae (James, et al., Journal of
Insect Physiology 39: 797-804 (1993)). Monoclonal antibodies directed to the region of the
enniatin synthetase complex responsible for N-methylation of activated amino acids cross
react with the synthetases for beauvericin and destruxin, demonstrating their structural
relatedness (Kleinkauf & von Dohren, European Journal of Biochemistry 192: 1-15 (1990)).
The enniatin synthetase gene (esyn1) from Fusarium scirpi has been cloned and
sequenced (Haese, et al., Molecular Microbiology 7: 905-914 (1993)), and the sequence
information is used to carry out a cloning strategy for the beauvericin synthetase and
destruxin synthetase genes as described above. Probes for the beauvericin synthetase
(BE) gene and the destruxin synthetase (DXS) gene are produced by amplifying specific
regions of Beauveria bassiana genomic DNA or Metarhizium anisopliae genomic DNA using
oligomers whose sequences are taken from the enniatin synthetase sequence as PCR
primers. Two pairs of PCR primers are chosen, with one pair capable of causing the
amplification of the segment of the BE gene spanning the initiation codon, and the other
pair capable of causing the amplification of the segment of the BE gene which spans the
termination codon. Each pair will cause the production of a DNA fragment which is
approximately 500 base pairs in size. Libraries of genomic DNA from Beauveria bassiana
and Metarhizium anisopliae are probed with the labeled fragments, and clones which
hybridize to both of them are chosen. Complete coding sequences of beauvericin
synthetase will cause the appearance of phenylalanine-dependent ATP-PPi exchange in an
appropriate host and that of destruxin will cause the appearance of valine-, isoleucine-, and
alanine-dependent ATP-PPi exchange. Extracts from these transformed organisms will
also carry out the cell-free biosynthesis of beauvericin and destruxin, respectively.

Example 23: Cloning Genes for the Biosynthesis of an Unknown Peptide Antibiotic
The genes for any peptide antibiotic are cloned by the use of consen/ed regions within the 
coding sequence. The functions common to all peptide antibiotic synthetases, that is, 
amino acid activation, ATP-, and pantetheine binding, are reflected in a repeated domain 
structure in which each domain spans approximately 600 amino acids. Within the domains, 
highly conserved sequences are known, and it is expected that related sequences will exist 
in any peptide antibiotic synthetase, regardless of its source. The published DNA
sequences of peptide synthetase genes, including gramicidin synthetases 1 and 2 (Hori, et
al., Journal of Biochemistry 106: 639-645 (1989); Krause, et al., Journal of Bacteriology
162: 1120-1125 (1985); Turgay, et al., Molecular Microbiology 6: 529-546 (1992)), tyrocidine
synthetase 1 and 2 (Weckermann, et al., Nucleic Acids Research 16: 11841 (1988)), ACV
synthetase (MacCabe, et al., Journal of Biological Chemistry 266: 12646-12654 (1991)),
enniatin synthetase (Haese, et al., Molecular Microbiology 7: 905-914 (1993)), and surfactin
synthetase (Fuma, et al., Nucleic Acids Research 21: 93-97 (1993); Grandi, et al., Eleventh
International Spores Conference (1992)) are compared and the individual repeated domains
are identified. The domains from all the synthetases are compared as a group, and the
most highly conserved sequences are identified. From these conserved sequences, DNA
oligomers are designed which are suitable for hybridizing to all of the observed variants of
the sequence, and another DNA sequence which lies, for example, from 0.1 to 2 kilobases
away from the first DNA sequence, is used to design another DNA oligomer. Such pairs of
DNA oligomers are used to amplify by PCR the intervening segment of the unknown gene
by combining them with genomic DNA prepared from the organism which produces the
antibiotic, and following a PCR amplification procedure. The fragment of DNA which is
produced is sequenced to confirm its identity, and used as a probe to identify clones
containing larger segments of the peptide synthetase gene in a genomic library. A variation
of this approach, in which the oligomers designed to hybridize to the conserved sequences
in the genes were used as hybridization probes themselves, rather than as primers of PCR
reactions, resulted in the identification of part of the surfactin synthetase gene from Bacillus
subtilis ATCC 21332 (Borchert, et al., FEMS Microbiological Letters 92: 175-180 (1992)).
The cloned genomic DNA which hybridizes to the PCR-generated probe is sequenced, and
the complete coding sequence is obtained by "walking" procedures. Such "walking"
procedures will also yield other genes required for the peptide antibiotic synthesis, because
they are known to be clustered.
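
The design of an oligomer that hybridizes to all observed variants of a conserved sequence can be illustrated with a short sketch. It is not part of the original disclosure: the function name `degenerate_oligo` and the three variant sequences are hypothetical, and the sketch assumes the conserved segments have already been aligned without gaps. Each alignment column is collapsed into the standard IUPAC ambiguity code, and the degeneracy (the number of distinct sequences in the oligomer pool) is reported.

```python
# Illustrative sketch only: collapse aligned conserved DNA segments into a
# single degenerate oligomer using the standard IUPAC ambiguity codes.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_oligo(aligned_seqs):
    """Return (oligo, degeneracy) for equal-length, gap-free aligned sequences."""
    seqs = [s.upper() for s in aligned_seqs]
    if not seqs or len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    oligo = []
    degeneracy = 1
    for column in zip(*seqs):
        bases = frozenset(column)
        oligo.append(IUPAC[bases])  # raises KeyError on non-ACGT characters
        degeneracy *= len(bases)
    return "".join(oligo), degeneracy


# Hypothetical variants of one conserved segment from three synthetase genes.
variants = ["AAAGGTGGT", "AAGGGCGGT", "AAAGGAGGT"]
print(degenerate_oligo(variants))
```

A lower degeneracy generally gives a more specific primer, so in practice the segment with the fewest ambiguous positions would be preferred for each member of the primer pair.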






Another method of obtaining the genes which code for the synthetase(s) of a novel peptide
antibiotic is by the detection of antigenic determinants expressed in a heterologous host
after transformation with an appropriate genomic library made from DNA from the
antibiotic-producing organism. It is expected that the common structural features of the
synthetases will be evidenced by cross-reactions with antibodies raised against different
synthetase proteins. Such antibodies are raised against peptide synthetases purified from
known antibiotic-producing organisms by known methods (Ishihara, et al., Journal of
Bacteriology 171: 1705-1711 (1989)). Transformed organisms bearing fragments of genomic
DNA from the producer of the unknown peptide antibiotic are tested for the presence of
antigenic determinants which are recognized by the anti-peptide synthetase antisera by
methods known in the art. The cloned genomic DNA carried by cells which are identified by
the antisera is recovered and sequenced. "Walking" techniques, as described earlier, are
used to obtain both the entire coding sequence and other biosynthetic genes.

Another method of obtaining the genes which code for the synthetase of an unknown
peptide antibiotic is by the purification of a protein which has the characteristics of the
appropriate peptide synthetase, and determining all or part of its amino acid sequence. The
amino acids present in the antibiotic are determined by first purifying it from a chloroform
extract of a culture of the antibiotic-producing organism, for example by reverse phase
chromatography on a C18 column in an ethanol-water mixture. The composition of the
purified compound is determined by mass spectrometry, NMR, and analysis of the products
of acid hydrolysis. The amino or hydroxy acids present in the peptide antibiotic will produce
ATP-PPi exchange when added to a peptide-synthetase-containing extract from the
antibiotic-producing organism. This reaction is used as an assay to detect the presence of
the peptide synthetase during the course of a protein purification scheme, such as are
known in the art. A substantially pure preparation of the peptide synthetase is used to
determine its amino acid sequence, either by the direct sequencing of the intact protein to
obtain the N-terminal amino acid sequence, or by the production, purification, and
sequencing of peptides derived from the intact peptide synthetase by the action of specific
proteolytic enzymes, as are known in the art. A DNA sequence is inferred from the amino
acid sequence of the synthetase, and DNA oligomers are designed which are capable of
hybridizing to such a coding sequence. The oligomers are used to probe a genomic library
made from the DNA of the antibiotic-producing organism. Selected clones are sequenced
to identify them, and complete coding sequences and associated genes required for
peptide biosynthesis are obtained by using "walking" techniques. Extracts from organisms
which have been transformed with the entire complement of peptide biosynthetic genes, for
example bacteria or fungi, will produce the peptide antibiotic when provided with the
required amino or hydroxy acids, ATP, and pantetheine.
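
Because most amino acids are encoded by more than one codon, inferring a DNA sequence from an amino acid sequence yields a pool of possible coding sequences. The following sketch, which is an illustration and not part of the original disclosure, picks the stretch of a peptide whose back-translation is least degenerate, assuming the standard genetic code. The function name `least_degenerate_window` and the example peptides are hypothetical.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_RESIDUE = {
    "A": 4, "C": 2, "D": 2, "E": 2, "F": 2, "G": 4, "H": 2, "I": 3,
    "K": 2, "L": 6, "M": 1, "N": 2, "P": 4, "Q": 2, "R": 6, "S": 6,
    "T": 4, "V": 4, "W": 1, "Y": 2,
}


def least_degenerate_window(peptide, width):
    """Return (start, window, degeneracy) for the peptide stretch of the given
    width whose back-translated oligomer pool is smallest.

    Ties are resolved in favour of the window closest to the N-terminus.
    """
    peptide = peptide.upper()
    if width < 1 or width > len(peptide):
        raise ValueError("width must be between 1 and the peptide length")
    best = None
    for start in range(len(peptide) - width + 1):
        window = peptide[start:start + width]
        degeneracy = prod(CODONS_PER_RESIDUE[aa] for aa in window)
        if best is None or degeneracy < best[2]:
            best = (start, window, degeneracy)
    return best


# Hypothetical N-terminal sequence; a 3-residue window maps to a 9-nucleotide oligomer.
print(least_degenerate_window("ARNDMWKF", 3))
```

Stretches rich in methionine and tryptophan, which have a single codon each, are favoured, while leucine, arginine and serine (six codons each) are best avoided.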

Further methods appropriate for the cloning of genes required for the synthesis of non- 
ribosomal peptide antibiotics are described in Section B of the examples. 

Ribosomally-Synthesized Peptide Antibiotics

Ribosomally-synthesized peptide antibiotics are characterized by the existence of a
structural gene for the antibiotic itself, which encodes a precursor that is modified by
specific enzymes to create the mature molecule. The use of the general protein synthesis
apparatus for peptide antibiotic synthesis opens up the possibility for much longer polymers
to be made, although these peptide antibiotics are not necessarily very large. In addition to
a structural gene, further genes are required for extracellular secretion and immunity, and
these genes are believed to be located close to the structural gene, in most cases probably
on the same operon. Two major groups of peptide antibiotics made on ribosomes exist:
those which contain the unusual amino acid lanthionine, and those which do not.
Lanthionine-containing antibiotics (lantibiotics) are produced by gram-positive bacteria,
including species of Lactococcus, Staphylococcus, Streptococcus, Bacillus, and
Streptomyces. Linear lantibiotics (for example, nisin, subtilin, epidermin, and gallidermin),
and circular lantibiotics (for example, duramycin and cinnamycin), are known (Hansen,
Annual Review of Microbiology 47: 535-564 (1993); Kolter & Moreno, Annual Review of
Microbiology 46: 141-163 (1992)). Lantibiotics often contain other characteristic modified
residues such as dehydroalanine (DHA) and dehydrobutyrine (DHB), which are derived
from the dehydration of serine and threonine, respectively. The reaction of a thiol from
cysteine with DHA yields lanthionine, and with DHB yields β-methyllanthionine. Peptide
antibiotics which do not contain lanthionine may contain other modifications, or they may
consist only of the ordinary amino acids used in protein synthesis. Non-lanthionine-
containing peptide antibiotics are produced by both gram-positive and gram-negative
bacteria, including Lactobacillus, Lactococcus, Pediococcus, Enterococcus, and
Escherichia. Antibiotics in this category include lactacins, lactocins, sakacin A, pediocins,
diplococcin, lactococcins, and microcins (Hansen, supra; Kolter & Moreno, supra). In
general, peptide antibiotics whose synthesis is begun on ribosomes are subject to several
types of post-translational processing, including proteolytic cleavage and modification of
amino acid side chains, and require the presence of a specific transport and/or immunity
mechanism. The necessity for protection from the effects of these antibiotics appears to
contrast strongly with the lack of such systems for nonribosomal peptide antibiotics. This
may be rationalized by considering that the antibiotic activity of many ribosomally-
synthesized peptide antibiotics is directed at a narrow range of bacteria which are fairly
closely related to the producing organism. In this situation, a particular method of
distinguishing the producer from the competitor is required, or else the advantage is lost.
As antibiotics, this property has limited the usefulness of this class of molecules for
situations in which a broad range of activity is desirable, but enhances their attractiveness in
cases when a very limited range of activities is advantageous. In eukaryotic systems, which
are not known to be sensitive to any of this type of peptide antibiotic, it is not clear if
production of a ribosomally-synthesized peptide antibiotic necessitates one of these
transport systems, or if transport out of the cell is merely a matter of placing the antibiotic in
a better location to encounter potential pathogens. This question can be addressed
experimentally, as shown in the examples which follow.

Example 24: Cloning Genes for the Biosynthesis of a Lantibiotic

Examination of genes linked to the structural genes for the lantibiotics nisin, subtilin, and
epidermin shows several open reading frames which share sequence homology, and the
predicted amino acid sequences suggest functions which are necessary for the maturation
and transport of the antibiotic. The spa genes of Bacillus subtilis ATCC 6633, including
spaS, the structural gene encoding the precursor to subtilin, have been sequenced (Chung
& Hansen, Journal of Bacteriology 174: 6699-6702 (1992); Chung, et al., Journal of
Bacteriology 174: 1417-1422 (1992); Klein, et al., Applied and Environmental Microbiology
58: 132-142 (1992)). Open reading frames were found only upstream of spaS, at least
within a distance of 1-2 kilobases. Several of the open reading frames appear to be part of
the same transcriptional unit, spaE, spaD, spaB, and spaC, with a putative promoter upstream
of spaE. Both spaB, which encodes a protein of 599 amino acids, and spaD, which
encodes a protein of 177 amino acids, share homology to genes required for the transport
of hemolysin, coding for the HlyB and HlyD proteins, respectively. SpaE, which encodes a
protein of 851 amino acids, is homologous to nisB, a gene linked to the structural gene for
nisin, for which no function is known. SpaC codes for a protein of 442 amino acids of
unknown function, but disruption of it eliminates production of subtilin. These genes are
contained on a segment of genomic DNA which is approximately 7 kilobases in size (Chung
& Hansen, Journal of Bacteriology 174: 6699-6702 (1992); Chung, et al., Journal of
Bacteriology 174: 1417-1422 (1992); Klein, et al., Applied and Environmental Microbiology
58: 132-142 (1992)). It has not been clearly demonstrated if these genes are completely
sufficient to confer the ability to produce subtilin. A 13.5 kilobasepair (kb) fragment from
plasmid Tü32 of Staphylococcus epidermidis Tü3298 containing the structural gene for
epidermin (epiA) also contains five open reading frames denoted epiB, epiC, epiD,
epiQ, and epiP. The genes epiBC are homologous to the genes spaBC, while epiQ
appears to be involved in the regulation of the expression of the operon, and epiP may
encode a protease which acts during the maturation of pre-epidermin to epidermin. EpiD
encodes a protein of 181 amino acids which binds the coenzyme flavin mononucleotide,
and is suggested to perform post-translational modification of pre-epidermin (Kupke, et al.,
Journal of Bacteriology 174: (1992); Peschel, et al., Molecular Microbiology 9: 31-39 (1993);
Schnell, et al., European Journal of Biochemistry 204: 57-68 (1992)). It is expected that
many, if not all, of the genes required for the biosynthesis of a lantibiotic will be clustered,
and physically close together on either genomic DNA or on a plasmid, and an approach
which allows one of the necessary genes to be located will be useful in finding and cloning
the others. The structural gene for a lantibiotic is cloned by designing oligonucleotide
probes based on the amino acid sequence determined from a substantially purified
preparation of the lantibiotic itself, as has been done with the lantibiotics lacticin 481 from
Lactococcus lactis subsp. lactis CNRZ 481 (Piard, et al., Journal of Biological Chemistry
268: 16361-16368 (1993)), streptococcin A-FF22 from Streptococcus pyogenes FF22
(Hynes, et al., Applied and Environmental Microbiology 59: 1969-1971 (1993)), and
salivaricin A from Streptococcus salivarius 203P (Ross, et al., Applied and Environmental
Microbiology 59: 2014-2021 (1993)). Fragments of bacterial DNA approximately 10-20
kilobases in size containing the structural gene are cloned and sequenced to determine
regions of homology to the characterized genes in the spa, epi, and nis operons. Open
reading frames which have homology to any of these genes or which lie in the same
transcriptional unit as open reading frames having homology to any of these genes are
cloned individually using techniques known in the art. A fragment of DNA containing all of
the associated reading frames and no others is transformed into a non-producing strain of
bacteria, such as Escherichia coli, and the production of the lantibiotic is analyzed, in order to
demonstrate that all the required genes are present.
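
Locating open reading frames in a sequenced fragment is the first computational step before homology comparisons of the kind described here. The sketch below is illustrative only and not part of the original disclosure. It scans the three forward reading frames for ATG-to-stop stretches; the function name `find_orfs`, the `min_codons` threshold and the test sequence are hypothetical, and a fuller analysis would repeat the scan on the reverse complement and consider alternative start codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) index pairs of forward-strand ORFs.

    Each ORF runs from an ATG to the first in-frame stop codon; `end` is
    exclusive and includes the stop codon. `min_codons` counts the codons
    before the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ATG in this frame
    return sorted(orfs)


# Hypothetical toy sequence with a single short ORF.
fragment = "CCATGAAATTTTAGGG"
for begin, end in find_orfs(fragment):
    print(begin, end, fragment[begin:end])
```

In a real 10-20 kilobase fragment, `min_codons` would be set much higher (for example 50 or more) to suppress spurious short frames.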

Example 25: Cloning Genes for the Biosynthesis of a Non-Lanthionine-Containing,
Ribosomally Synthesized Peptide Antibiotic

The lack of the extensive modifications present in lantibiotics is expected to reduce the
number of genes required to account for the complete synthesis of peptide antibiotics
exemplified by lactacin F, sakacin A, lactococcin A, and helveticin J. Clustered genes
involved in the biosynthesis of antibiotics were found in Lactobacillus johnsonii VPI11088,
for lactacin F (Fremaux, et al., Applied and Environmental Microbiology 59: 3906-3915
(1993)), in Lactobacillus sake Lb706 for sakacin A (Axelsson, et al., Applied and
Environmental Microbiology 59: 2868-2875 (1993)), in Lactococcus lactis for lactococcin A
(Stoddard, et al., Applied and Environmental Microbiology 58: 1952-1961 (1992)), and in
Pediococcus acidilactici for pediocin PA-1 (Marugg, et al., Applied and Environmental
Microbiology 58: 2360-2367 (1992)). The genes required for the biosynthesis of a novel
non-lanthionine-containing peptide antibiotic are cloned by first determining the amino acid
sequence of a substantially purified preparation of the antibiotic, designing DNA oligomers
based on the amino acid sequence, and probing a DNA library constructed from either
genomic or plasmid DNA from the producing bacterium. Fragments of DNA of 5-10
kilobases which contain the structural gene for the antibiotic are cloned and sequenced.
Open reading frames which have homology to sakB from Lactobacillus sake, or to lafX,
ORFY, or ORFZ from Lactobacillus johnsonii, or which are part of the same transcriptional
unit as the antibiotic structural gene or genes having homology to those genes previously
mentioned are individually cloned by methods known in the art. A fragment of DNA
containing all of the associated reading frames and no others is transformed into a non-
producing strain of bacteria, such as Escherichia coli, and the production of the antibiotic
is analyzed, in order to demonstrate that all the required genes are present.

Example 26: Overexpression of APS Biosynthetic Genes for Overproduction of APS
using Fermentation-Type Technology

The APS biosynthetic genes of this invention can be expressed in heterologous organisms
for the purposes of their production at greater quantities than might be possible from their
native hosts. A suitable host for heterologous expression is E. coli and techniques for gene
expression in E. coli are well known. For example, the cloned APS genes can be
expressed in E. coli using the expression vector pKK223 as described in example 11. The
cloned genes can be fused in transcriptional fusion, so as to use the available ribosome
binding site cognate to the heterologous gene. This approach facilitates the expression of
operons which encode more than one open reading frame as translation of the individual
ORFs will thus be dependent on their cognate ribosome binding site signals. Alternatively,
APS genes can be fused to the vector's ATG (e.g. as an NcoI fusion) so as to use the E.
coli ribosome binding site. For multiple ORF expression in E. coli (e.g. in the case of
operons with multiple ORFs) this type of construct would require a separate promoter to be
fused to each ORF. It is possible, however, to fuse the first ATG of the APS operon to the
E. coli ribosome binding site while requiring the other ORFs to utilize their cognate ribosome
binding sites. These types of construction for the overexpression of genes in E. coli are
well known in the art. Suitable bacterial promoters include the lac promoter, the tac (trp/lac)
promoter, and the Pλ promoter from bacteriophage λ. Suitable commercially available
vectors include, for example, pKK223-3, pKK233-2, pDR540, pDR720, pYEJ001 and pPL-
Lambda (from Pharmacia, Piscataway, NJ).

Similarly, gram-positive bacteria, notably Bacillus species and particularly Bacillus
licheniformis, are used in commercial scale production of heterologous proteins and can be
adapted to the expression of APS biosynthetic genes (e.g. Quax et al., In: Industrial
Microorganisms: Basic and Applied Molecular Genetics, Eds.: Baltz et al., American Society
for Microbiology, Washington (1993)). Regulatory signals from a highly expressed Bacillus
gene (e.g. amylase promoter, Quax et al., supra) are used to generate transcriptional
fusions with the APS biosynthetic genes.

In some instances, high level expression of bacterial genes has been achieved using yeast
systems such as the methylotrophic yeast Pichia pastoris (Sreekrishna, In: Industrial
Microorganisms: Basic and Applied Molecular Genetics, Baltz, Hegeman, and Skatrud eds.,
American Society for Microbiology, Washington (1993)). The APS gene(s) of interest are
positioned behind 5' regulatory sequences of the Pichia alcohol oxidase gene in vectors
such as pHIL-D1 and pHIL-D2 (Sreekrishna, supra). Such vectors are used to transform
Pichia and introduce the heterologous DNA into the yeast genome. Likewise, the yeast
Saccharomyces cerevisiae has been used to express heterologous bacterial genes (e.g.
Dequin & Barre, Biotechnology 12: 173-177 (1994)). The yeast Kluyveromyces lactis is also
a suitable host for heterologous gene expression (e.g. van den Berg et al., Biotechnology
8: 135-139 (1990)).

Overexpression of APS genes in organisms such as E. coli, Bacillus and yeast, which are
known for their rapid growth and multiplication, will enable fermentation-production of larger
quantities of APSs. The choice of organism may be restricted by the possible susceptibility
of the organism to the APS being overproduced; however, the likely susceptibility can be
determined by the procedures outlined in Section J. The APSs can be isolated and purified
from such cultures (see -^S") for use in the control of microorganisms such as fungi and
bacteria.

I. Expression of Antibiotic Biosynthetic Genes in Microbial Hosts for Biocontrol
Purposes

The cloned APS biosynthetic genes of this invention can be utilized to increase the efficacy
of biocontrol strains of various microorganisms. One possibility is the transfer of the genes
for a particular APS back into its native host under stronger transcriptional regulation to
cause the production of larger quantities of the APS. Another possibility is the transfer of
genes to a heterologous host, causing production in the heterologous host of an APS not
normally produced by that host.

Microorganisms which are suitable for the heterologous overexpression of APS genes are
all microorganisms which are capable of colonizing plants or the rhizosphere. As such they
will be brought into contact with phytopathogenic fungi, causing an inhibition of their growth.
These include gram-negative microorganisms such as Pseudomonas, Enterobacter and
Serratia, the gram-positive microorganisms Bacillus and Streptomyces spp. and the fungi
Trichoderma and Gliocladium. Particularly preferred heterologous hosts are Pseudomonas
fluorescens, Pseudomonas putida, Pseudomonas cepacia, Pseudomonas aureofaciens,
Pseudomonas aurantiaca, Enterobacter cloacae, Serratia marcescens, Bacillus subtilis,
Bacillus cereus, Trichoderma viride, Trichoderma harzianum and Gliocladium virens.

Example 27: Expression of APS Biosynthetic Genes in E. coli and Other Gram-
Negative Bacteria

Many genes have been expressed in gram-negative bacteria in a heterologous manner.
Example 11 describes the expression of genes for pyrrolnitrin biosynthesis in E. coli using
the expression vector pKK223-3 (Pharmacia catalogue # 27-4935-01). This vector has a
strong tac promoter (Brosius, J. et al., Proc. Natl. Acad. Sci. USA 81) regulated by the lac
repressor and induced by IPTG. A number of other expression systems have been
developed for use in E. coli and some are detailed in Examples 14-17 above. The
thermoinducible expression vector pPL (Pharmacia #27-4946-01) uses a tightly regulated
bacteriophage λ promoter which allows for high level expression of proteins. The lac
promoter provides another means of expression but the promoter is not expressed at such
high levels as the tac promoter. With the addition of broad host range replicons to some of
these expression system vectors, production of antifungal compounds in closely related
gram-negative bacteria such as Pseudomonas, Enterobacter, Serratia and Erwinia is
possible. For example, pLRKD211 (Kaiser & Kroos, Proc. Natl. Acad. Sci. USA 81: 5816-
5820 (1984)) contains the broad host range replicon ori T which allows replication in many
gram-negative bacteria.

In E. coli, induction by IPTG is required for expression of the tac (i.e. trp-lac) promoter.
When this same promoter (e.g. on wide-host range plasmid pLRKD211) is introduced into
Pseudomonas it is constitutively active without induction by IPTG. This trp-lac promoter can
be placed in front of any gene or operon of interest for expression in Pseudomonas or any
other closely related bacterium for the purposes of the constitutive expression of such a
gene. If the operon of interest contains the information for the biosynthesis of an APS, then
an otherwise biocontrol-minus strain of a gram-negative bacterium may be able to protect
plants against a variety of fungal diseases. Thus, genes for antifungal compounds can
be placed behind a strong constitutive promoter and transferred to a bacterium that
normally does not produce antifungal products and which has plant or rhizosphere
colonizing properties, turning these organisms into effective biocontrol strains. Other
possible promoters can be used for the constitutive expression of APS genes in gram-
negative bacteria. These include, for example, the promoter from the Pseudomonas
regulatory genes gafA and lemA (WO 94/01561) and the Pseudomonas savastanoi IAA
operon promoter (Gaffney et al., J. Bacteriol. 172: 5593-5601 (1990)).

The synthetic Prn operon with the tac promoter as described in example 11a was inserted
into two broad host range vectors that replicate in a wide range of Gram-negative bacteria.
The first vector, pRK290 (Ditta et al. 1980, PNAS 77(12) pp. 7347-7351), is a low copy
number plasmid and the second vector, pBBR1MCS (Kovach et al. 1994, Biotechniques
16(5): 800-802), a medium copy number plasmid. Constructs of both vectors containing the
Prn genes were introduced into a number of Gram-negative bacterial strains and assayed
for production of pyrrolnitrin by TLC and HPLC. A number of strains were shown to
heterologously produce pyrrolnitrin. These include E. coli, Pseudomonas sp. (MOCG133,
MOCG380, MOCG382, BL897, BL1889, BL2595) and Enterobacter taylorae (MOCG206).

Example 28: Expression of APS Biosynthetic Genes in Gram-Positive Bacteria 

Heterologous expression of APS genes in gram-positive bacteria is another
means of producing new biocontrol strains. Expression systems for Bacillus and
Streptomyces are the best characterized. The promoter for the erythromycin resistance
gene (ermR) from Streptococcus pneumoniae has been shown to be active in gram-positive
aerobes and anaerobes and also in E. coli (Trieu-Cuot et al., Nucl Acids Res 18: 3660
(1990)). A further antibiotic resistance promoter from the thiostreptone gene has been used
in Streptomyces cloning vectors (Bibb, Mol Gen Genet 199: 26-36 (1985)). The shuttle
vector pHT3101 is also appropriate for expression in Bacillus (Lereclus, FEMS Microbiol
Lett 60: 211-218 (1989)). By expressing an operon (such as the pyrrolnitrin operon) or
individual APS encoding genes under control of the ermR or other promoters it will be
possible to convert soil bacilli into strains able to protect plants against microbial diseases.
A significant advantage of this approach is that many gram-positive bacteria produce
spores which can be used in formulations that produce biocontrol products with a longer
shelf life. Bacillus and Streptomyces species are aggressive colonizers of soils. In fact
both produce secondary metabolites including antibiotics active against a broad range of
organisms and the addition of heterologous antifungal genes (including those
encoding pyrrolnitrin, soraphen, phenazine or cyclic peptides) to gram-positive bacteria may
make these organisms even better biocontrol strains.

Example 29: Expression of APS Biosynthetic Genes in Fungi 

Trichoderma harzianum and Gliocladium virens have been shown to provide varying levels
of biocontrol in the field (US 5,165,928 and US 4,996,157, both to Cornell Research
Foundation). The successful use of these biocontrol agents will be greatly enhanced by the
development of improved strains by the introduction of genes for APSs. This could be
accomplished in a number of ways which are well known in the art. One is protoplast
mediated transformation of the fungus by PEG or electroporation-mediated techniques.
Alternatively, particle bombardment can be used to transform protoplasts or other fungal
cells with the ability to develop into regenerated mature structures. The vector pAN7-1,
originally developed for Aspergillus transformation and now used widely for fungal
transformation (Curragh et al., Mycol. Res. 97(3): 313-317 (1992); Tooley et al., Curr.
Genet. 21: 55-60 (1992); Punt et al., Gene 56: 117-124 (1987)) is engineered to contain the
pyrrolnitrin operon, or any other genes for APS biosynthesis. This plasmid contains the E.
coli hygromycin B resistance gene flanked by the Aspergillus nidulans gpd promoter and
the trpC terminator (Punt et al., Gene 56: 117-124 (1987)).

J. In Vitro Activity of Anti-phytopathogenic Substances Against Plant Pathogens

Example 30: Bioassay Procedures for the Detection of Antifungal Activity 

Inhibition of fungal growth by a potential antifungal agent can be determined in a number of
assay formats. Macroscopic methods which are commonly used include the agar diffusion
assay (Dhingra & Sinclair, Basic Plant Pathology Methods, CRC Press, Boca Raton, FL
(1985)) and assays in liquid media (Broekaert et al., FEMS Microbiol. Lett. 69: 55-60
(1990)). Both types of assay are performed with either fungal spores or mycelia as
inocula. The maintenance of fungal stocks is in accordance with standard mycological
procedures. Spores for bioassay are harvested from a mature plate of a fungus by flushing
the surface of the culture with sterile water or buffer. A suspension of mycelia is prepared
by placing fungus from a plate in a blender and homogenizing until the colony is dispersed.
The homogenate is filtered through several layers of cheesecloth so that larger particles are
excluded. The suspension which passes through the cheesecloth is washed by
centrifugation and replacing the supernatant with fresh buffer. The concentration of the
mycelial suspension is adjusted empirically, by testing the suspension in the bioassay to be
used.

Agar diffusion assays may be performed by suspending spores or mycelial fragments in a
solid test medium, and applying the antifungal agent at a point source, from which it
diffuses. This may be done by adding spores or mycelia to melted fungal growth medium,
then pouring the mixture into a sterile dish and allowing it to gel. Sterile filters are placed on
the surface of the medium, and solutions of antifungal agents are spotted onto the filters.
After the liquid has been absorbed by the filter, the plates are incubated at the appropriate
temperature, usually for 1-2 days. Growth inhibition is indicated by the presence of zones
around filters in which spores have not germinated, or in which mycelia have not grown.
The antifungal potency of the agent, denoted as the minimal effective dose, may be
quantified by spotting serial dilutions of the agent onto filters, and determining the lowest
dose which gives an observable inhibition zone. Another agar diffusion assay can be
performed by cutting wells into solidified fungal growth medium and placing solutions of
antifungal agents into them. The plate is inoculated at a point equidistant from all the wells,
usually at the center of the plate, with either a small aliquot of spore or mycelial suspension
or a mycelial plug cut directly from a stock culture plate of the fungus. The plate is
incubated for several days until the growing mycelia approach the wells, then it is observed
for signs of growth inhibition. Inhibition is indicated by the deformation of the roughly
circular form which the fungal colony normally assumes as it grows. Specifically, if the
mycelial front appears flattened or even concave relative to the uninhibited sections of the
plate, growth inhibition has occurred. A minimal effective concentration may be determined
by testing diluted solutions of the agent to find the lowest at which an effect can be
detected.
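
The bookkeeping behind a minimal effective dose determination can be sketched as follows. This is an illustration and not part of the original disclosure: the function names, the two-fold dilution scheme, the starting dose and the scored outcomes are all hypothetical. Each spotted dose is paired with a yes/no score for an observable inhibition zone, and the lowest dose scored positive is reported.

```python
def serial_dilutions(top_dose, fold, steps):
    """Return the doses of a serial dilution series, highest dose first."""
    return [top_dose / fold ** i for i in range(steps)]


def minimal_effective_dose(doses, inhibited):
    """Return the lowest dose scored as giving an inhibition zone, or None."""
    effective = [dose for dose, hit in zip(doses, inhibited) if hit]
    return min(effective) if effective else None


# Hypothetical series: 100 micrograms per filter, diluted two-fold four times.
doses = serial_dilutions(100.0, 2, 4)
# Hypothetical scores: a zone was seen at the three highest doses only.
zones = [True, True, True, False]
print(doses, minimal_effective_dose(doses, zones))
```

The resolution of the result is limited by the dilution factor: with two-fold steps, the true threshold lies somewhere between the reported dose and the next lower one.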

Bioassays in liquid media are conducted using suspensions of spores or mycelia which are
incubated in liquid fungal growth media instead of solid media. The fungal inocula, medium,
and antifungal agent are mixed in wells of a 96-well microtiter plate, and the growth of the
fungus is followed by measuring the turbidity of the culture spectrophotometrically.
Increases in turbidity correlate with increases in biomass, and are a measure of fungal
growth. Growth inhibition is determined by comparing the growth of the fungus in the
presence of the antifungal agent with growth in its absence. By testing diluted solutions of
antifungal inhibitor, a minimal inhibitory concentration or an EC50 may be determined.
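
One simple way to estimate an EC50 from turbidity readings is to express growth in each treated well as a fraction of the untreated control and interpolate, on a logarithmic concentration scale, between the two dilutions that bracket 50% growth. The sketch below is illustrative and not part of the original disclosure: the function names and the readings are hypothetical, and log-linear interpolation is an assumption standing in for whatever curve-fitting method a laboratory actually uses.

```python
import math


def growth_fraction(od_treated, od_control, od_blank=0.0):
    """Growth in a treated well relative to the untreated control."""
    return (od_treated - od_blank) / (od_control - od_blank)


def ec50_log_interpolation(concentrations, fractions):
    """Estimate the concentration giving 50% of control growth.

    Concentrations must be positive. Returns None when no adjacent pair of
    dilutions brackets 50% growth.
    """
    pairs = sorted(zip(concentrations, fractions))
    for (c0, f0), (c1, f1) in zip(pairs, pairs[1:]):
        if f0 >= 0.5 >= f1 and f0 != f1:
            t = (f0 - 0.5) / (f0 - f1)  # position between the two doses
            log_ec50 = math.log10(c0) + t * (math.log10(c1) - math.log10(c0))
            return 10 ** log_ec50
    return None


# Hypothetical ten-fold dilution series and relative growth values.
print(ec50_log_interpolation([1.0, 10.0, 100.0], [0.9, 0.5, 0.1]))
```

Fitting a sigmoidal dose-response model to all the points would use the data more fully; interpolation is shown here only because it needs no external library.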

Example 31 : Bioassay Procedures for the Detection of Antibacterial Activity 
A number of bioassays may be employed to detemiine ttie antibacterial activity of an 
unknown compound. The inhibition of bacterial growth in solid media may be assessed by 
dispersing an inoculum of the bacterial culture in melted medium and spreading the 
suspension evenly in the bottom of a sterile Petri dish. After the medium has gelled, sterile 
filter disks are placed on the surface, and aliquots of the test material are spotted onto 
them. The plate is incubated overnight at an appropriate temperature, and growth inhibition 
is observed as an area around a filter in which the bacteria have not grown, or in which the 
growth is reduced compared to the sun-ounding areas. Pure compounds may be 
characterized by the detemiination of a minimal effective dose, the smallest amount of 
material which gives a zone of inhibited growth. In liquid media, two other methods may be 
employed. The growth of a culture may be monitored by measuring the optical density of 
the culture, in actuality the scattering of incident light Equal tnocula are seeded into equal 
culture volumes, with one culture containing a known amount of a potential antibacterial 
agent After incubation at an appropriate temperature, and with appropriate aeration as 
required by the bacterium being tested, ttie optical densities of tiie cultures are compared. 
A suitable wavelength for the comparison is 600 nm. The antibacterial agent may be 
characterimd by the determination of a minimal effective dose, the smallest amount of 
material which produces a reduction in tiie density of ttie culture, or by determining an 
EC50. the concentration at which the growtti of the test culture is half ttiat of the control 
The bioassays described above do not differentiate between bacteriostatic and
bacteriocidal effects. Another assay can be performed which will determine the
bacteriocidal activity of the agent. This assay is carried out by incubating the bacteria and
the active agent together in liquid medium for an amount of time and under conditions which
are sufficient for the agent to exert its effect. After this incubation is completed, the bacteria
may be either washed by centrifugation and resuspension, or diluted by the addition of
fresh medium. In either case, the concentration of the antibacterial agent is reduced to a
point at which it is no longer expected to have significant activity. The bacteria are plated
and spread on solid medium and the plates are incubated overnight at an appropriate
temperature for growth. The number of colonies which arise on the plates is counted, and
the number which appeared from the mixture which contained the antibacterial agent is
compared with the number which arose from the mixture which contained no antibacterial
agent. The reduction in colony-forming units is a measure of the bacteriocidal activity of the
agent. The bacteriocidal activity may be quantified as a minimal effective dose, or as an
EC50, as described above. Bacteria which are used in assays such as these include
species of Agrobacterium, Erwinia, Clavibacter, Xanthomonas, and Pseudomonas.
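
The EC50 determination from optical density readings can be illustrated with a short calculation. The following is a minimal sketch only, assuming a dilution series of positive, ascending concentrations and a single untreated control; the function name ec50_from_od and the log-linear interpolation between the two bracketing doses are illustrative choices and are not taken from the procedure above.

```python
import math


def ec50_from_od(concentrations, od_treated, od_control):
    """Estimate EC50 by log-linear interpolation between bracketing doses.

    concentrations: positive doses in ascending order (assumed).
    od_treated: optical density of each treated culture, same order.
    od_control: optical density of the untreated control culture.
    """
    # Growth of each treated culture as a fraction of the control.
    fractions = [od / od_control for od in od_treated]
    points = list(zip(concentrations, fractions))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        # Find the pair of doses between which growth crosses 50%.
        if f_lo >= 0.5 >= f_hi and f_lo != f_hi:
            t = (f_lo - 0.5) / (f_lo - f_hi)
            log_ec50 = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ec50
    return None  # 50% growth is not bracketed by the tested doses
```

A return value of None indicates that the tested doses do not span the half-maximal point, in which case a wider dilution series would be needed.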

Example 32: Antipathogenic Activity Determination of APSs 

APSs are assayed using the procedures of examples 30 and 31 above to identify the range 
of fungi and bacteria against which they are active. The APS can be isolated from the cells 
and culture medium of the host organism normally producing it, or can alternatively be
isolated from a heterologous host which has been engineered to produce the APS. A 
further possibility is the chemical synthesis of APS compounds of known chemical structure, 
or derivatives thereof. 

Example 33: Antimicrobial Activity Determination of Pyrrolnitrin

a) The anti-phytopathogenic activity of a fluorinated 3-cyano-derivative of pyrrolnitrin
(designated CGA173506) was observed against the maize fungal phytopathogens Diplodia
maydis, Colletotrichum graminicola, and Gibberella zeae-maydis. Spores of the fungi were
harvested and suspended in water. Approximately 1000 spores were inoculated into potato
dextrose broth and either CGA173506 or water in a total volume of 100 microliters in the
wells of 96-well microtiter plates suitable for a plate reader. The compound CGA173506
was obtained as a 50% wettable powder, and a stock suspension was made up at a
concentration of 10 mg/ml in sterile water. This stock suspension was diluted with sterile
water to provide the 173506 used in the tests. After the spores, medium, and 173506 were
mixed, the turbidity in the wells was measured by reading the absorbance at 600 nm in a
plate reader. This reading was taken as the background turbidity, and was subtracted from
readings taken at later times. After 46 hours of incubation, the presence of 1 microgram/ml
of 173506 was determined to reduce the growth of Diplodia maydis by 64%, and after 120
hours, the same concentration of 173506 inhibited the growth of Colletotrichum graminicola
by 50%. After 40 hours of incubation, the presence of 0.5 microgram/ml of 173506 gave
100% inhibition of Gibberella zeae-maydis.

b) Pyrrolnitrin was tested for its effect on the growth of various maize fungal pathogens and
inhibited growth of Bipolaris maydis, Colletotrichum graminicola, Diplodia maydis, Fusarium
moniliforme, Gibberella zeae and Rhizoctonia solani.

To determine growth inhibition, autoclaved filter discs (0.25 inch diameter from Schleicher
and Schuell) were placed near the perimeter of PDA (DIFCO) plates. Solutions were
pipetted onto these filters. 2.5 micrograms pyrrolnitrin (25 microliters) were placed on one
filter disc and 25 microliters 63% ethanol were placed on the other disc. Fungal plugs were
taken from stock plates and placed in the center of the PDA plates. Each fungus was
inoculated onto one plate, the fungus was allowed to grow and inhibition was scored at
appropriate times. Inhibition of the fungi indicated above was visually detected.
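
The background-corrected turbidity readings of part a) can be converted to a percent inhibition as sketched below. This is an illustrative calculation only: it assumes that inhibition is expressed relative to the water-treated control well, and the function name and the absorbance values in the usage note are hypothetical rather than data from the example.

```python
def percent_inhibition(treated_t0, treated_t, control_t0, control_t):
    """Percent growth inhibition from absorbance readings at 600 nm.

    The reading taken immediately after mixing (t0) is treated as
    background turbidity and subtracted from the later reading.
    """
    treated_growth = treated_t - treated_t0
    control_growth = control_t - control_t0
    if control_growth <= 0:
        raise ValueError("control well shows no growth")
    return 100.0 * (1.0 - treated_growth / control_growth)
```

With hypothetical readings of 0.05 and 0.23 for a treated well and 0.05 and 0.55 for the control well, the function returns approximately 64, i.e. the treated culture grew to 36% of the control.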

K. Expression of Antibiotic Biosynthetic Genes in Transgenic Plants
Example 34: Modification of Coding Sequences and Adjacent Sequences
The cloned APS biosynthetic genes described in this application can be modified for
expression in transgenic plant hosts. This is done with the aim of producing extractable
quantities of APS from transgenic plants (i.e. for similar reasons to those described in
Section E above), or alternatively the aim of such expression can be the accumulation of
APS in plant tissue for the provision of pathogen protection on host plants. A host plant
expressing genes for the biosynthesis of an APS and which produces the APS in its cells
will have enhanced resistance to phytopathogen attack and will thus be better equipped to
withstand crop losses associated with such attack.

The transgenic expression in plants of genes derived from microbial sources may require
the modification of those genes to achieve and optimize their expression in plants. In
particular, bacterial ORFs which encode separate enzymes but which are encoded by the
same transcript in the native microbe are best expressed in plants on separate transcripts.
To achieve this, each microbial ORF is isolated individually and cloned within a cassette
which provides a plant promoter sequence at the 5' end of the ORF and a plant
transcriptional terminator at the 3' end of the ORF. The isolated ORF sequence preferably
includes the initiating ATG codon and the terminating STOP codon but may include
additional sequence beyond the initiating ATG and the STOP codon. In addition, the ORF
may be truncated, but still retain the required activity; for particularly long ORFs, truncated
versions which retain activity may be preferable for expression in transgenic organisms. By
"plant promoter" and "plant transcriptional terminator" it is intended to mean promoters and
transcriptional terminators which operate within plant cells. This includes promoters and
transcription terminators which may be derived from non-plant sources such as viruses (an
example is the Cauliflower Mosaic Virus).
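
Before an isolated ORF is placed in a cassette, it can be checked that the fragment begins with the initiating ATG, ends with a STOP codon and has no premature STOP codon in frame. The sketch below is illustrative only; it assumes the standard genetic code and a fragment trimmed exactly to the ORF, and the function name check_orf is hypothetical. Bacterial ORFs that use an alternative start codon such as GTG would be reported as lacking an ATG.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_orf(seq):
    """Return a list of problems found in a candidate ORF (empty if none)."""
    seq = seq.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of three")
    if not seq.startswith("ATG"):
        problems.append("does not begin with ATG")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("contains an internal stop codon")
    return problems
```

An empty list indicates that the fragment is a complete ORF in the sense used above.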

In some cases, modification to the ORF coding sequences and adjacent sequence will not
be required. It is sufficient to isolate a fragment containing the ORF of interest and to insert
it downstream of a plant promoter. For example, Gaffney et al. (Science 261: 754-756
(1993)) have expressed the Pseudomonas nahG gene in transgenic plants under the
control of the CaMV 35S promoter and the CaMV tml terminator successfully without
modification of the coding sequence and with 56 bp of the Pseudomonas gene upstream of
the ATG still attached, and 165 bp downstream of the STOP codon still attached to the
nahG ORF. Preferably as little adjacent microbial sequence as possible should be left attached
upstream of the ATG and downstream of the STOP codon. In practice, such construction
may depend on the availability of restriction sites.

In other cases, the expression of genes derived from microbial sources may provide
problems in expression. These problems have been well characterized in the art and are
particularly common with genes derived from certain sources such as Bacillus. These
problems may apply to the APS biosynthetic genes of this invention and the modification of
these genes can be undertaken using techniques now well known in the art. The following
problems may be encountered:

(1) Codon Usage. The preferred codon usage in plants differs from the preferred codon
usage in certain microorganisms. Comparison of the usage of codons within a cloned
microbial ORF to usage in plant genes (and in particular genes from the target plant) will
enable an identification of the codons within the ORF which should preferably be changed.
Typically plant evolution has tended towards a strong preference of the nucleotides C and
G in the third base position of monocotyledons, whereas dicotyledons often use the
nucleotides A or T at this position. By modifying a gene to incorporate preferred codon
usage for a particular target transgenic species, many of the problems described below for
GC/AT content and illegitimate splicing will be overcome.
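
The third-position preference described above can be quantified for a cloned ORF as the fraction of codons ending in G or C. The following is a minimal sketch, assuming the sequence is supplied in frame starting at the initiating ATG; the function name gc3_fraction is hypothetical.

```python
def gc3_fraction(orf):
    """Fraction of codons whose third base is G or C."""
    orf = orf.upper()
    # Third base of every complete codon, reading from position 0.
    third_bases = [orf[i + 2] for i in range(0, len(orf) - 2, 3)]
    if not third_bases:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)
```

Comparing this value for a microbial ORF against the same value computed over genes of the target plant gives a first indication of how many codons would preferably be changed.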

(2) GC/AT Content. Plant genes typically have a GC content of more than 35%. ORF
sequences which are rich in A and T nucleotides can cause several problems in plants.
Firstly, motifs of ATTTA are believed to cause destabilization of messages and are found at
the 3' end of many short-lived mRNAs. Secondly, the occurrence of polyadenylation signals
such as AATAAA at inappropriate positions within the message is believed to cause 
premature truncation of transcription. In addition, monocotyledons may recognize AT-rich 
sequences as splice sites (see below). 
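
A candidate ORF can be screened for the features named in this paragraph (overall GC content, ATTTA motifs and AATAAA signals) with a simple scan. The sketch below is illustrative; the function names are hypothetical and the scan reports positions only, without judging whether a given occurrence is actually deleterious.

```python
def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def find_motif(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) match."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def flag_at_rich_problems(seq):
    """Summarize GC content and the two motifs discussed in the text."""
    return {
        "gc_content": gc_content(seq),
        "ATTTA": find_motif(seq, "ATTTA"),
        "AATAAA": find_motif(seq, "AATAAA"),
    }
```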

(3) Sequences Adjacent to the Initiating Methionine. Plants differ from microorganisms in
that their messages do not possess a defined ribosome binding site. Rather, it is believed
that ribosomes attach to the 5' end of the message and scan for the first available ATG at
which to start translation. Nevertheless, it is believed that there is a preference for certain
nucleotides adjacent to the ATG and that expression of microbial genes can be enhanced
by the inclusion of a eukaryotic consensus translation initiator at the ATG. Clontech
(1993/1994 catalog, page 210) have suggested the sequence GTCGACCATGGTC (SEQ ID
NO:7) as a consensus translation initiator for the expression of the E. coli uidA gene in
plants. Further, Joshi (NAR 15: 6643-6653 (1987)) has compared many plant sequences
adjacent to the ATG and suggests the consensus TAAACAATGGCT (SEQ ID NO:8). In
situations where difficulties are encountered in the expression of microbial ORFs in plants,
inclusion of one of these sequences at the initiating ATG may improve translation. In such
cases the last three nucleotides of the consensus may not be appropriate for inclusion in
the modified sequence due to their modification of the second AA residue. Preferred
sequences adjacent to the initiating methionine may differ between different plant species.
A survey of 14 maize genes located in the GenBank database provided the following
results:

This analysis can be done for the desired plant species into which APS genes are being
incorporated, and the sequence adjacent to the ATG modified to incorporate the preferred
nucleotides.
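
A survey of this kind amounts to tallying which nucleotide occurs at each position flanking the initiating ATG across a set of genes. The following is a minimal sketch under stated assumptions: the function name context_counts is hypothetical, the first ATG found in each input is taken to be the initiating codon, and positions are numbered with the A of the ATG as +1, so that -1 is the base immediately upstream and +4 the base immediately downstream.

```python
from collections import Counter


def context_counts(sequences, upstream=4, downstream=2):
    """Count bases at each position flanking the first ATG of each sequence."""
    counts = {}
    for seq in sequences:
        seq = seq.upper()
        atg = seq.find("ATG")
        # Skip inputs with no ATG or with too little flanking sequence.
        if atg < upstream or atg + 3 + downstream > len(seq):
            continue
        for offset in range(-upstream, 0):
            counts.setdefault(offset, Counter())[seq[atg + offset]] += 1
        for offset in range(1, downstream + 1):
            counts.setdefault(offset + 3, Counter())[seq[atg + 2 + offset]] += 1
    return counts
```

Applied to a set of genes from the target species, the most frequent base at each position suggests the preferred context; the two consensus sequences quoted in the text can serve as a small stand-in input for trying the function.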

(4) Removal of Illegitimate Splice Sites. Genes cloned from non-plant sources and not
optimized for expression in plants may also contain motifs which may be recognized in
plants as 5' or 3' splice sites, and be cleaved, thus generating truncated or deleted
messages.

Techniques for the modification of coding sequences and adjacent sequences are well
known in the art. In cases where the initial expression of a microbial ORF is low and it is
deemed appropriate to make alterations to the sequence as described above, then the
construction of synthetic genes can be accomplished according to methods well known in
the art. These are, for example, described in the published patent disclosures EP 0 385
962 (to Monsanto), EP 0 359 472 (to Lubrizol) and WO 93/07278 (to Ciba-Geigy). In most
cases it is preferable to assay the expression of gene constructions using transient assay
protocols (which are well known in the art) prior to their transfer to transgenic plants.



Example 35: Construction of Plant Transformation Vectors 

Numerous transformation vectors are available for plant transformation, and the genes of
this invention can be used in conjunction with any such vectors. The selection of vector for
use will depend upon the preferred transformation technique and the target species for
transformation. For certain target species, different antibiotic or herbicide selection markers
may be preferred. Selection markers used routinely in transformation include the nptII gene
which confers resistance to kanamycin and related antibiotics (Messing & Vierra, Gene 19:
259-268 (1982); Bevan et al., Nature 304: 184-187 (1983)), the bar gene which confers
resistance to the herbicide phosphinothricin (White et al., Nucl Acids Res 18: 1062 (1990),
Spencer et al., Theor Appl Genet 79: 625-631 (1990)), the hph gene which confers
resistance to the antibiotic hygromycin (Blochinger & Diggelmann, Mol Cell Biol 4: 2929-
2931), and the dhfr gene, which confers resistance to methotrexate (Bourouis et al., EMBO
J. 2(7): 1099-1104 (1983)).

(1) Construction of Vectors Suitable for Agrobacterium Transformation

Many vectors are available for transformation using Agrobacterium tumefaciens. These
typically carry at least one T-DNA border sequence and include vectors such as pBIN19
(Bevan, Nucl. Acids Res. (1984)). Below the construction of two typical vectors is
described.

Construction of pCIB200 and pCIB2001

The binary vectors pCIB200 and pCIB2001 are used for the construction of recombinant
vectors for use with Agrobacterium and were constructed in the following manner.
pTJS75kan was created by NarI digestion of pTJS75 (Schmidhauser & Helinski, J Bacteriol.
164: 446-455 (1985)) allowing excision of the tetracycline-resistance gene, followed by
insertion of an AccI fragment from pUC4K carrying an NPTII (Messing & Vierra, Gene 19:
259-268 (1982); Bevan et al., Nature 304: 184-187 (1983); McBride et al., Plant Molecular
Biology 14: 266-276 (1990)). XhoI linkers were ligated to the EcoRV fragment of pCIB7
which contains the left and right T-DNA borders, a plant selectable nos/nptII chimeric gene
and the pUC polylinker (Rothstein et al., Gene 53: 153-161 (1987)), and the XhoI-digested
fragment was cloned into SalI-digested pTJS75kan to create pCIB200 (see also EP 0 332
104, example 19). pCIB200 contains the following unique polylinker restriction sites: EcoRI,
SstI, KpnI, BglII, XbaI, and SalI. pCIB2001 is a derivative of pCIB200 which was created by
the insertion into the polylinker of additional restriction sites. Unique restriction sites in the
polylinker of pCIB2001 are EcoRI, SstI, KpnI, BglII, XbaI, SalI, MluI, BclI, AvrII, ApaI, HpaI,
and StuI. pCIB2001, in addition to containing these unique restriction sites, also has plant
and bacterial kanamycin selection, left and right T-DNA borders for Agrobacterium-mediated
transformation, the RK2-derived trfA function for mobilization between E. coli and other
hosts, and the OriT and OriV functions also from RK2. The pCIB2001 polylinker is suitable
for the cloning of plant expression cassettes containing their own regulatory signals.
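
Whether a polylinker site is unique in a given construct can be checked by counting occurrences of its recognition sequence. The sketch below is illustrative only: the function names are hypothetical, the input is treated as a linear sequence (a site spanning the junction of a circular plasmid would be missed), and only one strand is scanned, which suffices because the six recognition sequences listed are palindromic.

```python
# Recognition sequences of the six pCIB200 polylinker enzymes named above.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "SstI": "GAGCTC",
    "KpnI": "GGTACC",
    "BglII": "AGATCT",
    "XbaI": "TCTAGA",
    "SalI": "GTCGAC",
}


def count_sites(seq, site):
    """Count (possibly overlapping) occurrences of a recognition sequence."""
    seq, site = seq.upper(), site.upper()
    count = 0
    start = seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def unique_sites(seq, sites=RECOGNITION_SITES):
    """Names of enzymes whose site occurs exactly once in the sequence."""
    return sorted(name for name, site in sites.items() if count_sites(seq, site) == 1)
```

A cassette to be cloned into the polylinker can be screened in the same way, so that an enzyme which also cuts within the cassette is not chosen for the cloning step.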

Construction of pCIB10 and Hygromycin Selection Derivatives thereof
The binary vector pCIB10 contains a gene encoding kanamycin resistance for selection in
plants, T-DNA right and left border sequences and incorporates sequences from the wide
host-range plasmid pRK252 allowing it to replicate in both E. coli and Agrobacterium. Its
construction is described by Rothstein et al. (Gene 53: 153-161 (1987)). Various
derivatives of pCIB10 have been constructed which incorporate the gene for hygromycin B
phosphotransferase described by Gritz et al. (Gene 25: 179-188 (1983)). These derivatives
enable selection of transgenic plant cells on hygromycin only (pCIB743), or hygromycin and
kanamycin (pCIB715, pCIB717).

(2) Construction of Vectors Suitable for non-Agrobacterium Transformation. 
Transformation without the use of Agrobacterium tumefaciens circumvents the requirement
for T-DNA sequences in the chosen transformation vector and consequently vectors lacking
these sequences can be utilized in addition to vectors such as the ones described above
which contain T-DNA sequences. Transformation techniques which do not rely on
Agrobacterium include transformation via particle bombardment, protoplast uptake (e.g.
PEG and electroporation) and microinjection. The choice of vector depends largely on the
preferred selection for the species being transformed. Below, the construction of some
typical vectors is described.

Construction of pCIB3064

pCIB3064 is a pUC-derived vector suitable for direct gene transfer techniques in
combination with selection by the herbicide basta (or phosphinothricin). The plasmid
pCIB246 comprises the CaMV 35S promoter in operational fusion to the E. coli GUS gene
and the CaMV 35S transcriptional terminator and is described in the PCT published
application WO 93/07278. The 35S promoter of this vector contains two ATG sequences 5'
of the start site. These sites were mutated using standard PCR techniques in such a way
as to remove the ATGs and generate the restriction sites SspI and PvuII. The new
restriction sites were 96 and 37 bp away from the unique SalI site and 101 and 42 bp away
from the actual start site. The resultant derivative of pCIB246 was designated pCIB3025.
The GUS gene was then excised from pCIB3025 by digestion with SalI and SacI, the
termini rendered blunt and religated to generate plasmid pCIB3060. The plasmid pJIT82
was obtained from the John Innes Centre, Norwich, and a 400 bp SmaI fragment
containing the bar gene from Streptomyces viridochromogenes was excised and inserted
into the HpaI site of pCIB3060 (Thompson et al. EMBO J 6: 2519-2523 (1987)). This
generated pCIB3064 which comprises the bar gene under the control of the CaMV 35S
promoter and terminator for herbicide selection, a gene for ampicillin resistance (for
selection in E. coli) and a polylinker with the unique sites SphI, PstI, HindIII, and BamHI.
This vector is suitable for the cloning of plant expression cassettes containing their own
regulatory signals.

Construction of pSOG19 and pSOG35

pSOG35 is a transformation vector which utilizes the E. coli gene dihydrofolate reductase
(DHFR) as a selectable marker conferring resistance to methotrexate. PCR was used to
amplify the 35S promoter (~800 bp), intron 6 from the maize Adh1 gene (~550 bp) and 18
bp of the GUS untranslated leader sequence from pSOG10. A 250 bp fragment encoding
the E. coli dihydrofolate reductase type II gene was also amplified by PCR and these two
PCR fragments were assembled with a SacI-PstI fragment from pBI221 (Clontech) which
comprised the pUC19 vector backbone and the nopaline synthase terminator. Assembly of
these fragments generated pSOG19 which contains the 35S promoter in fusion with the
intron 6 sequence, the GUS leader, the DHFR gene and the nopaline synthase terminator.
Replacement of the GUS leader in pSOG19 with the leader sequence from Maize Chlorotic
Mottle Virus (MCMV) generated the vector pSOG35. pSOG19 and pSOG35 carry the pUC
gene for ampicillin resistance and have HindIII, SphI, PstI and EcoRI sites available for the
cloning of foreign sequences.

Example 36: Requirements for Construction of Plant Expression Cassettes 
Gene sequences intended for expression in transgenic plants are firstly assembled in 
expression cassettes behind a suitable promoter and upstream of a suitable transcription 
terminator. These expression cassettes can then be easily transferred to the plant 
transformation vectors described above in example 35.

Promoter Selection

The selection of promoter used in expression cassettes will determine the spatial and 
temporal expression pattern of the transgene in the transgenic plant. Selected promoters 
will express transgenes in specific cell types (such as leaf epidermal cells, mesophyll cells,
root cortex cells) or in specific tissues or organs (roots, leaves or flowers, for example) and
this selection will reflect the desired location of biosynthesis of the APS. Alternatively, the
selected promoter may drive expression of the gene under a light-induced or other
temporally regulated promoter. A further alternative is that the selected promoter be
chemically regulated. This would provide the possibility of inducing production of the
APS only when desired, by treatment with a chemical inducer.

Transcriptional Terminators 

A variety of transcriptional terminators are available for use in expression cassettes. These
are responsible for the termination of transcription beyond the transgene and its correct
polyadenylation. Appropriate transcriptional terminators are those which are known to
function in plants and include the CaMV 35S terminator, the tml terminator, the nopaline
synthase terminator, and the pea rbcS E9 terminator. These can be used in both
monocotyledons and dicotyledons.

Sequences for the Enhancement or Regulation of Expression

Numerous sequences have been found to enhance gene expression from within the
transcriptional unit and these sequences can be used in conjunction with the genes of this
invention to increase their expression in transgenic plants.

Various intron sequences have been shown to enhance expression, particularly in
monocotyledonous cells. For example, the introns of the maize Adh1 gene have been
found to significantly enhance the expression of the wild-type gene under its cognate
promoter when introduced into maize cells. Intron 1 was found to be particularly effective
and enhanced expression in fusion constructs with the chloramphenicol acetyltransferase
gene (Callis et al., Genes Develop 1: 1183-1200 (1987)). In the same experimental system,
the intron from the maize bronze1 gene had a similar effect in enhancing expression (Callis
et al., supra). Intron sequences have been routinely incorporated into plant transformation
vectors, typically within the non-translated leader.

A number of non-translated leader sequences derived from viruses are also known to
enhance expression, and these are particularly effective in dicotyledonous cells.
Specifically, leader sequences from Tobacco Mosaic Virus (TMV, the "Ω-sequence"), Maize
Chlorotic Mottle Virus (MCMV), and Alfalfa Mosaic Virus (AMV) have been shown to be
effective in enhancing expression (e.g. Gallie et al. Nucl. Acids Res. 15: 8693-8711 (1987);
Skuzeski et al. Plant Molec. Biol. 15: 65-79 (1990)).

Targeting of the Gene Product Within the Cell

Various mechanisms for targeting gene products are known to exist in plants and the
sequences controlling the functioning of these mechanisms have been characterized in
some detail. For example, the targeting of gene products to the chloroplast is controlled by
a signal sequence found at the aminoterminal end of various proteins and which is cleaved
during chloroplast import yielding the mature protein (e.g. Comai et al. J. Biol. Chem. 263:
15104-15109 (1988)). These signal sequences can be fused to heterologous gene
products to effect the import of heterologous products into the chloroplast (van den Broeck
et al. Nature 313: 358-363 (1985)). DNA encoding for appropriate signal sequences can be
isolated from the 5' end of the cDNAs encoding the RUBISCO protein, the CAB protein, the
EPSP synthase enzyme, the GS2 protein and many other proteins which are known to be
chloroplast localized.

Other gene products are localized to other organelles such as the mitochondrion and the
peroxisome (e.g. Unger et al. Plant Molec. Biol. 13: 411-418 (1989)). The cDNAs encoding
these products can also be manipulated to effect the targeting of heterologous gene
products to these organelles. Examples of such sequences are the nuclear-encoded
ATPases and specific aspartate amino transferase isoforms for mitochondria. Targeting to
cellular protein bodies has been described by Rogers et al. (Proc. Natl. Acad. Sci. USA 82:
6512-6516 (1985)).

In addition, sequences have been characterized which cause the targeting of gene products
to other cell compartments. Aminoterminal sequences are responsible for targeting to the
ER, the apoplast, and extracellular secretion from aleurone cells (Koehler & Ho, Plant Cell
2: 769-783 (1990)). Additionally, aminoterminal sequences in conjunction with
carboxyterminal sequences are responsible for vacuolar targeting of gene products (Shinshi
et al. Plant Molec. Biol. 14: 357-368 (1990)).

By the fusion of the appropriate targeting sequences described above to transgene 
sequences of interest it is possible to direct the transgene product to any organelle or cell 
compartment. For chloroplast targeting, for example, the chloroplast signal sequence from 
the RUBISCO gene, the CAB gene, the EPSP synthase gene, or the GS2 gene is fused in 
frame to the aminoterminal ATG of the transgene. The signal sequence selected should 
include the known cleavage site and the fusion constructed should take into account any 
amino acids after the cleavage site which are required for cleavage. In some cases this 
requirement may be fulfilled by the addition of a small number of amino acids between the 
cleavage site and the transgene ATG or alternatively replacement of some amino acids
within the transgene sequence. Fusions constructed for chloroplast import can be tested
for efficacy of chloroplast uptake by in vitro translation of in vitro transcribed constructions
followed by in vitro chloroplast uptake using techniques described by Bartlett et al. (In:
Edelmann et al. (Eds.) Methods in Chloroplast Molecular Biology, Elsevier, pp 1081-1091
(1982)) and Wasmann et al. (Mol. Gen. Genet. 205: 446-453 (1986)). These construction
techniques are well known in the art and are equally applicable to mitochondria and
peroxisomes. The choice of targeting which may be required for APS biosynthetic genes
will depend on the cellular localization of the precursor required as the starting point for a
given pathway. This will usually be cytosolic or chloroplastic, although it may in some cases
be mitochondrial or peroxisomal. The gene products of APS biosynthetic genes will not
normally require targeting to the ER, the apoplast or the vacuole.
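
The requirement that the signal sequence be fused in frame to the transgene ATG, optionally with a few additional codons after the cleavage site, can be checked at the sequence level. The following is a minimal sketch; the function name fuse_in_frame, its arguments and the toy sequences in the usage note are hypothetical, and it assumes the signal sequence coding region is supplied without its own stop codon.

```python
def fuse_in_frame(transit_cds, transgene_cds, linker=""):
    """Join a signal-sequence coding region to a transgene, keeping the frame."""
    transit_cds, transgene_cds, linker = (
        s.upper() for s in (transit_cds, transgene_cds, linker)
    )
    if len(transit_cds) % 3 != 0:
        raise ValueError("signal sequence is not a whole number of codons")
    if len(linker) % 3 != 0:
        raise ValueError("linker would shift the reading frame")
    if not transit_cds.startswith("ATG") or not transgene_cds.startswith("ATG"):
        raise ValueError("both parts are expected to begin with ATG")
    # No stop codon may precede the transgene in the fused reading frame.
    upstream = transit_cds + linker
    stops = {"TAA", "TAG", "TGA"}
    if any(upstream[i:i + 3] in stops for i in range(0, len(upstream), 3)):
        raise ValueError("stop codon upstream of the transgene")
    return upstream + transgene_cds
```

For example, fuse_in_frame("ATGGCTTCC", "ATGAAATAA", linker="GGA") returns "ATGGCTTCCGGAATGAAATAA", whereas a two-base linker is rejected because it would shift the transgene out of frame.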

The above described mechanisms for cellular targeting can be utilized not only in
conjunction with their cognate promoters, but also in conjunction with heterologous
promoters so as to effect a specific cell targeting goal under the transcriptional regulation of
a promoter which has an expression pattern different to that of the promoter from which the
targeting signal derives.

Example 37: Examples of Expression Cassette Construction

The present invention encompasses the expression of genes encoding APSs under the 
regulation of any promoter which is expressible in plants, regardless of the origin of the 
promoter. 

Furthermore, the invention encompasses the use of any plant-expressible promoter in
conjunction with any further sequences required or selected for the expression of the APS
gene. Such sequences include, but are not restricted to, transcriptional terminators,
extraneous sequences to enhance expression (such as introns (e.g. Adh intron 1) and viral
sequences (e.g. TMV-Ω)), and sequences intended for the targeting of the gene product to
specific organelles and cell compartments.

Constitutive Expression: the CaMV 35S Promoter

Construction of the plasmid pCGN1761 is described in the published patent application EP
0 392 225 (example 23). pCGN1761 contains the "double" 35S promoter and the tml
transcriptional terminator with a unique EcoRI site between the promoter and the terminator
and has a pUC-type backbone. A derivative of pCGN1761 was constructed which has a
modified polylinker which includes NotI and XhoI sites in addition to the existing EcoRI site.
This derivative was designated pCGN1761ENX. pCGN1761ENX is useful for the cloning of
cDNA sequences or gene sequences (including microbial ORF sequences) within its
polylinker for the purposes of their expression under the control of the 35S promoter in
transgenic plants. The entire 35S promoter-gene sequence-tml terminator cassette of such
a construction can be excised by HindIII, SphI, SalI, and XbaI sites 5' to the promoter and
XbaI, BamHI and BglI sites 3' to the terminator for transfer to transformation vectors such
as those described above in example 35. Furthermore, the double 35S promoter fragment
can be removed by 5' excision with HindIII, SphI, SalI, XbaI, or PstI, and 3' excision with
any of the polylinker restriction sites (EcoRI, NotI or XhoI) for replacement with another
promoter.

Modification of pCGN1761ENX by Optimization of the Translational Initiation Site
For any of the constructions described in this section, modifications around the cloning sites
can be made by the introduction of sequences which may enhance translation. This is
particularly useful when genes derived from microorganisms are to be introduced into plant
expression cassettes as these genes may not contain sequences adjacent to their initiating
methionine which may be suitable for the initiation of translation in plants. In cases where
genes derived from microorganisms are to be cloned into plant expression cassettes at their
ATG it may be useful to modify the site of their insertion to optimize their expression.
Modification of pCGN1761ENX is described by way of example to incorporate one of
several optimized sequences for plant expression (e.g. Joshi, NAR 15: 6643-6653 (1987)).

pCGN1761ENX is cleaved with SphI, treated with T4 DNA polymerase and religated, thus
destroying the SphI site located 5' to the double 35S promoter. This generates vector
pCGN1761ENX/Sph-. pCGN1761ENX/Sph- is cleaved with EcoRI, and ligated to an
annealed molecular adaptor of the sequence 5'-AATTCTAAAGCATGCCGATCGG-3' (SEQ
ID NO:9) / 5'-AATTCCGATCGGCATGCTTTA-3' (SEQ ID NO:10). This generates the vector
pCGNSENX which incorporates the quasi-optimized plant translational initiation sequence
TAAA-C adjacent to the ATG which is itself part of an SphI site which is suitable for cloning
heterologous genes at their initiating methionine. Downstream of the SphI site, the EcoRI,
NotI, and XhoI sites are retained.
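
The pairing of the two adaptor oligonucleotides (SEQ ID NO:9 and NO:10) can be checked by comparing one strand with the reverse complement of the other. The sketch below is illustrative only; the function names are hypothetical, and the search simply reports the longest exactly complementary stretch, leaving the single-stranded ends for inspection. Both oligonucleotides begin with AATT, the 5' overhang left by EcoRI.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

TOP = "AATTCTAAAGCATGCCGATCGG"    # SEQ ID NO:9, written 5' to 3'
BOTTOM = "AATTCCGATCGGCATGCTTTA"  # SEQ ID NO:10, written 5' to 3'


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_duplex(top, bottom):
    """Longest stretch of `top` exactly complementary to part of `bottom`."""
    target = reverse_complement(bottom)
    best = ""
    for i in range(len(top)):
        # Only try substrings longer than the best match found so far.
        for j in range(i + len(best) + 1, len(top) + 1):
            if top[i:j] in target:
                best = top[i:j]
            else:
                break
    return best
```

For the two sequences given, the paired core is the 17-nucleotide stretch TAAAGCATGCCGATCGG, which contains the SphI recognition sequence GCATGC with the ATG inside it.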

An alternative vector is constructed which utilizes an NcoI site at the initiating ATG. This
vector, designated pCGN1761NENX, is made by inserting an annealed molecular adaptor of
the sequence 5'-AATTCTAAACCATGGCGATCGG-3' (SEQ ID NO:11) /
5'-AATTCGGATCGCCATGGTTTA-3' (SEQ ID NO:12) at the pCGN1761ENX EcoRI site
(Sequence IDs 14 and 15). Thus, the vector includes the quasi-optimized sequence
TAAACC adjacent to the initiating ATG which is within the NcoI site. Downstream sites are
EcoRI, NotI, and XhoI. Prior to this manipulation, however, the two NcoI sites in the
pCGN1761ENX vector (at upstream positions of the 5' 35S promoter unit) are destroyed
using similar techniques to those described above for SphI or alternatively using
"inside-outside" PCR (Innes et al. PCR Protocols: A guide to methods and applications. Academic
Press, New York (1990); see Example 41). This manipulation can be assayed for any
possible detrimental effect on expression by insertion of any plant cDNA or reporter gene
sequence into the cloning site followed by routine expression analysis in plants.

Expression under a Chemically Regulatable Promoter

This section describes the replacement of the double 35S promoter in pCGN1761ENX with
any promoter of choice; by way of example the chemically regulated PR-1a promoter is
described. The promoter of choice is preferably excised from its source by restriction
enzymes, but can alternatively be PCR-amplified using primers which carry appropriate
terminal restriction sites. Should PCR-amplification be undertaken, then the promoter
should be resequenced to check for amplification errors after the cloning of the amplified
promoter in the target vector. The chemically regulatable tobacco PR-1a promoter is
cleaved from plasmid pCIB1004 (see EP 0 332 104, example 21 for construction) and
transferred to plasmid pCGN1761ENX. pCIB1004 is cleaved with NcoI and the resultant 3'
overhang of the linearized fragment is rendered blunt by treatment with T4 DNA
polymerase. The fragment is then cleaved with HindIII and the resultant PR-1a promoter
containing fragment is gel purified and cloned into pCGN1761ENX from which the double
35S promoter has been removed. This is done by cleavage with XhoI and blunting with T4
polymerase, followed by cleavage with HindIII and isolation of the larger vector-terminator
containing fragment into which the pCIB1004 promoter fragment is cloned. This generates
a pCGN1761ENX derivative with the PR-1a promoter and the tml terminator and an
intervening polylinker with unique EcoRI and NotI sites. Selected APS genes can be
inserted into this vector, and the fusion products (i.e. promoter-gene-terminator) can
subsequently be transferred to any selected transformation vector, including those
described in this application.

Constitutive Expression: the Actin Promoter

Several isoforms of actin are known to be expressed in most cell types and consequently
the actin promoter is a good choice for a constitutive promoter. In particular, the promoter
from the rice Act1 gene has been cloned and characterized (McElroy et al. Plant Cell 2:
163-171 (1990)). A 1.3 kb fragment of the promoter was found to contain all the regulatory
elements required for expression in rice protoplasts. Furthermore, numerous expression
vectors based on the Act1 promoter have been constructed specifically for use in
monocotyledons (McElroy et al. Mol. Gen. Genet. 231: 150-160 (1991)). These incorporate
the Act1 intron 1, Adh1 5' flanking sequence and Adh1 intron 1 (from the maize alcohol
dehydrogenase gene) and sequence from the CaMV 35S promoter. Vectors showing
highest expression were fusions of 35S and the Act1 intron or the Act1 5' flanking sequence
and the Act1 intron. Optimization of sequences around the initiating ATG (of the GUS
reporter gene) also enhanced expression. The promoter expression cassettes described by
McElroy et al. (Mol. Gen. Genet. 231: 150-160 (1991)) can be easily modified for the
expression of APS biosynthetic genes and are particularly suitable for use in
monocotyledonous hosts. For example, promoter containing fragments can be removed
from the McElroy constructions and used to replace the double 35S promoter in
pCGN1761ENX, which is then available for the insertion of specific gene sequences. The
fusion genes thus constructed can then be transferred to appropriate transformation
vectors. In a separate report the rice Act1 promoter with its first intron has also been found
to direct high expression in cultured barley cells (Chibbar et al. Plant Cell Rep. 12: 506-509
(1993)).

Constitutive Expression: the Ubiquitin Promoter

Ubiquitin is another gene product known to accumulate in many cell types and its promoter
has been cloned from several species for use in transgenic plants (e.g. sunflower - Binet et
al. Plant Science 79: 87-94 (1991), maize - Christensen et al. Plant Molec. Biol. 12: 619-632
(1989)). The maize ubiquitin promoter has been developed in transgenic monocot systems
and its sequence and vectors constructed for monocot transformation are disclosed in the
patent publication EP 0 342 926 (to Lubrizol). Further, Taylor et al. (Plant Cell Rep. 12:
491-495 (1993)) describe a vector (pAHC25) which comprises the maize ubiquitin promoter
and first intron and its high activity in cell suspensions of numerous monocotyledons when
introduced via microprojectile bombardment. The ubiquitin promoter is clearly suitable for
the expression of APS biosynthetic genes in transgenic plants, especially monocotyledons.
Suitable vectors are derivatives of pAHC25 or any of the transformation vectors described
in this application, modified by the introduction of the appropriate ubiquitin promoter and/or
intron sequences.

Root Specific Expression 

A preferred pattern of expression for the APSs of the instant invention is root expression.
Root expression is particularly useful for the control of soil-borne phytopathogens such as
Rhizoctonia and Pythium. Expression of APSs only in root tissue would have the
advantage of controlling root invading phytopathogens, without a concomitant accumulation
of APS in leaf and flower tissue and seeds. A suitable root promoter is that described by de
Framond (FEBS 290: 103-106 (1991)) and also in the published patent application EP 0
452 269 (to Ciba-Geigy). This promoter is transferred to a suitable vector such as
pCGN1761ENX for the insertion of an APS gene of interest and subsequent transfer of the
entire promoter-gene-terminator cassette to a transformation vector of interest.

Wound Inducible Promoters 

Wound-inducible promoters are particularly suitable for the expression of APS biosynthetic
genes because they are typically active not just on wound induction, but also at the sites of
phytopathogen infection. Numerous such promoters have been described (e.g. Xu et al.
Plant Molec. Biol. 22: 573-588 (1993), Logemann et al. Plant Cell 1: 151-158 (1989),
Rohrmeier & Lehle, Plant Molec. Biol. 22: 783-792 (1993), Firek et al. Plant Molec. Biol. 22:
129-142 (1993), Warner et al. Plant J. 3: 191-201 (1993)) and all are suitable for use with
the instant invention. Logemann et al. (supra) describe the 5' upstream sequences of the
dicotyledonous potato wun1 gene. Xu et al. (supra) show that a wound inducible promoter
from the dicotyledon potato (pin2) is active in the monocotyledon rice. Further, Rohrmeier &
Lehle (supra) describe the cloning of the maize Wip1 cDNA which is wound induced and
which can be used to isolate the cognate promoter using standard techniques. Similarly,
Firek et al. (supra) and Warner et al. (supra) have described a wound induced gene from
the monocotyledon Asparagus officinalis which is expressed at local wound and pathogen
invasion sites. Using cloning techniques well known in the art, these promoters can be
transferred to suitable vectors, fused to the APS biosynthetic genes of this invention, and
used to express these genes at the sites of phytopathogen infection.

Pith Preferred Expression

Patent Application WO 93/07278 (to Ciba-Geigy) describes the isolation of the maize trpA
gene which is preferentially expressed in pith cells. The gene sequence and promoter
extending up to nucleotide -1726 from the start of transcription are presented. Using
standard molecular biological techniques, this promoter or parts thereof can be transferred
to a vector such as pCGN1761 where it can replace the 35S promoter and be used to drive
the expression of a foreign gene in a pith-preferred manner. In fact, fragments containing
the pith-preferred promoter or parts thereof can be transferred to any vector and modified
for utility in transgenic plants.

Pollen-Specific Expression

Patent Application WO 93/07278 (to Ciba-Geigy) further describes the isolation of the
maize calcium-dependent protein kinase (CDPK) gene which is expressed in pollen cells.
The gene sequence and promoter extend up to 1400 bp from the start of transcription.
Using standard molecular biological techniques, this promoter or parts thereof can be
transferred to a vector such as pCGN1761 where it can replace the 35S promoter and be
used to drive the expression of a foreign gene in a pollen-specific manner. In fact,
fragments containing the pollen-specific promoter or parts thereof can be transferred to any
vector and modified for utility in transgenic plants.

Leaf-Specific Expression 

A maize gene encoding phosphoenol carboxylase (PEPC) has been described by Hudspeth
& Grula (Plant Molec. Biol. 12: 579-589 (1989)). Using standard molecular biological
techniques the promoter for this gene can be used to drive the expression of any gene in a
leaf-specific manner in transgenic plants.

Expression with Chloroplast Targeting 

Chen & Jagendorf (J. Biol. Chem. 268: 2363-2367 (1993)) have described the successful
use of a chloroplast transit peptide for import of a heterologous transgene. The peptide
used is the transit peptide from the rbcS gene from Nicotiana plumbaginifolia (Poulsen et al.
Mol. Gen. Genet. 205: 193-200 (1986)). Using the restriction enzymes DraI and SphI, or
Tsp509I and SphI, the DNA sequence encoding this transit peptide can be excised from
plasmid prbcS-8B (Poulsen et al. supra) and manipulated for use with any of the
constructions described above. The DraI-SphI fragment extends from -58 relative to the
initiating rbcS ATG to, and including, the first amino acid (also a methionine) of the mature
peptide immediately after the import cleavage site, whereas the Tsp509I-SphI fragment
extends from -8 relative to the initiating rbcS ATG to, and including, the first amino acid of
the mature peptide. Thus, these fragments can be appropriately inserted into the polylinker
of any chosen expression cassette generating a transcriptional fusion to the untranslated
leader of the chosen promoter (e.g. 35S, PR-1a, actin, ubiquitin etc.), whilst enabling the
insertion of a required APS gene in correct fusion downstream of the transit peptide.
Constructions of this kind are routine in the art. For example, whereas the DraI end is
already blunt, the 5' Tsp509I site may be rendered blunt by T4 polymerase treatment, or
may alternatively be ligated to a linker or adaptor sequence to facilitate its fusion to the
chosen promoter. The 3' SphI site may be maintained as such, or may alternatively be
ligated to adaptor or linker sequences to facilitate its insertion into the chosen vector in such
a way as to make available appropriate restriction sites for the subsequent insertion of a
selected APS gene. Ideally the ATG of the SphI site is maintained and comprises the first
ATG of the selected APS gene. Chen & Jagendorf (supra) provide consensus sequences
for ideal cleavage for chloroplast import, and in each case a methionine is preferred at the
first position of the mature protein. At subsequent positions there is more variation and the
amino acid may not be so critical. In any case, fusion constructions can be assessed for
efficiency of import in vitro using the methods described by Bartlett et al. (in: Edelmann et
al. (Eds.) Methods in Chloroplast Molecular Biology, Elsevier, pp 1081-1091 (1982)) and
Wasmann et al. (Mol. Gen. Genet. 205: 446-453 (1986)). Typically the best approach may
be to generate fusions using the selected APS gene with no modifications at the
aminoterminus, and only to incorporate modifications when it is apparent that such fusions
are not chloroplast imported at high efficiency, in which case modifications may be made in
accordance with the established literature (Chen & Jagendorf, supra; Wasmann et al., supra;
Ko & Ko, J. Biol. Chem. 267: 13910-13916 (1992)).

A preferred vector is constructed by transferring the DraI-SphI transit peptide encoding
fragment from prbcS-8B to the cloning vector pCGN1761ENX/Sph-. This plasmid is
cleaved with EcoRI and the termini rendered blunt by treatment with T4 DNA polymerase.
Plasmid prbcS-8B is cleaved with SphI and ligated to an annealed molecular adaptor of the
sequence 5'-CCAGCTGGAATTCCG-3' (SEQ ID NO:13) / 5'-CGGAATTCCAGCTGGCATG-3'
(SEQ ID NO:14). The resultant product is 5'-terminally phosphorylated by treatment with T4
kinase. Subsequent cleavage with DraI releases the transit peptide encoding fragment
which is ligated into the blunt-end ex-EcoRI sites of the modified vector described above.
Clones oriented with the 5' end of the insert adjacent to the 3' end of the 35S promoter are
identified by sequencing. These clones carry a DNA fusion of the 35S leader sequence to
the rbcS-8A promoter-transit peptide sequence extending from -58 relative to the rbcS ATG
to the ATG of the mature protein, and including at that position a unique SphI site, and a
newly created EcoRI site, as well as the existing NotI and XhoI sites of pCGN1761ENX.
This new vector is designated pCGN1761/CT. DNA sequences are transferred to
pCGN1761/CT in frame by amplification using PCR techniques and incorporation of an
SphI, NspHI, or NlaIII site at the amplified ATG, which following restriction enzyme cleavage
with the appropriate enzyme is ligated into SphI-cleaved pCGN1761/CT. To facilitate
construction, it may be required to change the second amino acid of the cloned gene;
however, in almost all cases the use of PCR together with standard site directed
mutagenesis will enable the construction of any desired sequence around the cleavage site
and first methionine of the mature protein.
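
The geometry of the SEQ ID NO:13 / SEQ ID NO:14 adaptor can be illustrated with a small check. This is a minimal sketch, not part of the described procedure: the helper names are hypothetical, and it relies on the fact that SphI cleavage (GCATG^C) leaves a four-base 3' overhang of sequence CATG.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def three_prime_overhang(short_oligo, long_oligo):
    """If long_oligo pairs fully with short_oligo and then extends past it,
    return the unpaired 3' extension; otherwise return None."""
    paired = reverse_complement(short_oligo)
    if long_oligo.startswith(paired):
        return long_oligo[len(paired):]
    return None

UPPER = "CCAGCTGGAATTCCG"      # SEQ ID NO:13
LOWER = "CGGAATTCCAGCTGGCATG"  # SEQ ID NO:14

overhang = three_prime_overhang(UPPER, LOWER)

# Top strand across the ligation junction: an SphI-cut end (...GCATG) joined to the adaptor.
junction = "GCATG" + UPPER

print(overhang)              # unpaired 3' end of the lower oligo
print(junction[:6])          # first six bases at the junction
print("GAATTC" in UPPER)     # EcoRI recognition sequence carried by the adaptor
```

On this reading the annealed adaptor is blunt at one end and carries a CATG 3' extension at the other, so that it pairs with an SphI-cut end, restores GCATGC at the junction, and introduces the new EcoRI site mentioned above.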

A further preferred vector is constructed by replacing the double 35S promoter of
pCGN1761ENX with the BamHI-SphI fragment of prbcS-8A which contains the full-length
light regulated rbcS-8A promoter from nucleotide -1038 (relative to the transcriptional start
site) up to the first methionine of the mature protein. The modified pCGN1761 with the
destroyed SphI site is cleaved with PstI and EcoRI and treated with T4 DNA polymerase to
render termini blunt. prbcS-8A is cleaved with SphI and ligated to the annealed molecular
adaptor of the sequence described above. The resultant product is 5'-terminally
phosphorylated by treatment with T4 kinase. Subsequent cleavage with BamHI releases
the promoter-transit peptide containing fragment which is treated with T4 DNA polymerase
to render the BamHI terminus blunt. The promoter-transit peptide fragment thus generated
is cloned into the prepared pCGN1761ENX vector, generating a construction comprising the
rbcS-8A promoter and transit peptide with an SphI site located at the cleavage site for
insertion of heterologous genes. Further, downstream of the SphI site there are EcoRI
(re-created), NotI, and XhoI cloning sites. This construction is designated pCGN1761rbcS/CT.

Similar manipulations can be undertaken to utilize other GS2 chloroplast transit peptide
encoding sequences from other sources (monocotyledonous and dicotyledonous) and from
other genes. In addition, similar procedures can be followed to achieve targeting to other
subcellular compartments such as mitochondria.

Example 38: Techniques for the Isolation of New Promoters Suitable for the
Expression of APS Genes

New promoters are isolated using standard molecular biological techniques including any of
the techniques described below. Once isolated, they are fused to reporter genes such as
GUS or LUC and their expression pattern in transgenic plants analyzed (Jefferson et al.
EMBO J. 6: 3901-3907 (1987); Ow et al. Science 234: 856-859 (1986)). Promoters which
show the desired expression pattern are fused to APS genes for expression in planta.

Subtractive cDNA Cloning

Subtractive cDNA cloning techniques are useful for the generation of cDNA libraries
enriched for a particular population of mRNAs (e.g. Hara et al. Nucl. Acids Res. 19:
1097-1104 (1991)). Recently, techniques have been described which allow the construction of
subtractive libraries from small amounts of tissue (Sharma et al. Biotechniques 15: 610-612
(1993)). These techniques are suitable for the enrichment of messages specific for tissues
which may be available only in small amounts such as the tissue immediately adjacent to
wound or pathogen infection sites.

Differential Screening by Standard Plus/Minus Techniques

λ phage carrying cDNAs derived from different RNA populations (viz. root versus whole
plant, stem specific versus whole plant, local pathogen infection points versus whole plant,
etc.) are plated at low density and transferred to two sets of hybridization filters (for a review
of differential screening techniques see Calvet, Pediatr. Nephrol. 5: 751-757 (1991)).
cDNAs derived from the "choice" RNA population are hybridized to the first set and cDNAs
from whole plant RNA are hybridized to the second set of filters. Plaques which hybridize to
the first probe, but not to the second, are selected for further evaluation. They are picked
and their cDNA used to screen Northern blots of "choice" RNA versus RNA from various
other tissues and sources. Clones showing the required expression pattern are used to
clone gene sequences from a genomic library to enable the isolation of the cognate
promoter. Between 500 and 5000 bp of the cloned promoter is then fused to a reporter
gene (e.g. GUS, LUC) and reintroduced into transgenic plants for expression analysis.
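
The plus/minus selection rule (hybridizes to the first probe but not to the second) amounts to a simple filter over paired filter signals. The sketch below is purely illustrative: the plaque identifiers, signal values and threshold are invented for the example and do not come from any experiment.

```python
def plus_minus_select(signals, threshold=1.0):
    """Return plaque ids that score with the "choice" probe but not the control.

    signals maps a plaque id to (choice_signal, control_signal); a signal at or
    above `threshold` counts as hybridization (threshold is an assumption).
    """
    return sorted(
        plaque
        for plaque, (choice, control) in signals.items()
        if choice >= threshold and control < threshold
    )

# Hypothetical filter readings: (choice probe, whole-plant probe)
EXAMPLE_SIGNALS = {
    "p01": (5.2, 0.1),  # choice only: candidate
    "p02": (4.8, 4.5),  # both probes: not differential
    "p03": (0.2, 0.3),  # neither probe
    "p04": (2.1, 0.4),  # choice only: candidate
}

selected = plus_minus_select(EXAMPLE_SIGNALS)
print(selected)
```

Plaques selected in this way would still need the Northern blot confirmation described above before promoter isolation.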

Differential Screening by Differential Display

RNA is isolated from different sources, i.e. the choice source and whole plants as control,
and subjected to the differential display technique of Liang and Pardee (Science 257:
967-971 (1992)). Amplified fragments which appear in the choice RNA, but not the control, are
gel purified and used as probes on Northern blots carrying different RNA samples as
described above. Fragments which hybridize selectively to the required RNA are cloned
and used as probes to isolate the cDNA and also a genomic DNA fragment from which the
promoter can be isolated. The isolated promoter is fused to a GUS or LUC reporter gene
as described above to assess its expression pattern in transgenic plants.

Promoter Isolation Using "Promoter Trap" Technology

The insertion of promoterless reporter genes into transgenic plants can be used to identify
sequences in a host plant which drive expression in desired cell types or with a desired
strength. Variations of this technique are described by Ott & Chua (Mol. Gen. Genet. 223:
169-179 (1990)) and Kertbundit et al. (Proc. Natl. Acad. Sci. USA 88: 5212-5216 (1991)). In
standard transgenic experiments the same principle can be extended to identify enhancer
elements in the host genome where a particular transgene may be expressed at particularly
high levels.

Example 39: Transformation of Dicotyledons 

Transformation techniques for dicotyledons are well known in the art and include
Agrobacterium-based techniques and techniques which do not require Agrobacterium.
Non-Agrobacterium techniques involve the uptake of exogenous genetic material directly by
protoplasts or cells. This can be accomplished by PEG or electroporation mediated uptake,
particle bombardment-mediated delivery, or microinjection. Examples of these techniques
are described by Paszkowski et al., EMBO J. 3: 2717-2722 (1984), Potrykus et al., Mol. Gen.
Genet. 199: 169-177 (1985), Reich et al., Biotechnology 4: 1001-1004 (1986), and Klein et
al., Nature 327: 70-73 (1987). In each case the transformed cells are regenerated to whole
plants using standard techniques known in the art.

Agrobacterium-mediated transformation is a preferred technique for transformation of
dicotyledons because of its high efficiency of transformation and its broad utility with many
different species. The many crop species which are routinely transformable by
Agrobacterium include tobacco, tomato, sunflower, cotton, oilseed rape, potato, soybean,
alfalfa and poplar (EP 0 317 511 (cotton), EP 0 249 432 (tomato, to Calgene), WO
87/07299 (Brassica, to Calgene), US 4,795,855 (poplar)). Agrobacterium transformation
typically involves the transfer of the binary vector carrying the foreign DNA of interest (e.g.
pCIB200 or pCIB2001) to an appropriate Agrobacterium strain which may depend on the
complement of vir genes carried by the host Agrobacterium strain either on a co-resident Ti
plasmid or chromosomally (e.g. strain CIB542 for pCIB200 and pCIB2001 (Uknes et al.
Plant Cell 5: 159-169 (1993))). The transfer of the recombinant binary vector to
Agrobacterium is accomplished by a triparental mating procedure using E. coli carrying the
recombinant binary vector, a helper E. coli strain which carries a plasmid such as pRK2013
and which is able to mobilize the recombinant binary vector to the target Agrobacterium
strain. Alternatively, the recombinant binary vector can be transferred to Agrobacterium by
DNA transformation (Höfgen & Willmitzer, Nucl. Acids Res. 16: 9877 (1988)).

Transformation of the target plant species by recombinant Agrobacterium usually involves
co-cultivation of the Agrobacterium with explants from the plant and follows protocols well
known in the art. Transformed tissue is regenerated on selectable medium carrying the
antibiotic or herbicide resistance marker present between the binary plasmid T-DNA
borders.

Example 40: Transformation of Monocotyledons 

Transformation of most monocotyledon species has now also become routine. Preferred
techniques include direct gene transfer into protoplasts using PEG or electroporation
techniques, and particle bombardment into callus tissue. Transformations can be
undertaken with a single DNA species or multiple DNA species (i.e. co-transformation) and
both these techniques are suitable for use with this invention. Co-transformation may have
the advantage of avoiding complex vector construction and of generating transgenic plants
with unlinked loci for the gene of interest and the selectable marker, enabling the removal of
the selectable marker in subsequent generations, should this be regarded desirable.
However, a disadvantage of the use of co-transformation is the less than 100% frequency
with which separate DNA species are integrated into the genome (Schocher et al.
Biotechnology 4: 1093-1096 (1986)).

Patent Applications EP 0 292 435 (to Ciba-Geigy), EP 0 392 225 (to Ciba-Geigy) and WO
93/07278 (to Ciba-Geigy) describe techniques for the preparation of callus and protoplasts
from an elite inbred line of maize, transformation of protoplasts using PEG or
electroporation, and the regeneration of maize plants from transformed protoplasts.
Gordon-Kamm et al. (Plant Cell 2: 603-618 (1990)) and Fromm et al. (Biotechnology 8:
833-839 (1990)) have published techniques for transformation of an A188-derived maize line
using particle bombardment. Furthermore, application WO 93/07278 (to Ciba-Geigy) and Koziel
et al. (Biotechnology 11: 194-200 (1993)) describe techniques for the transformation of elite
inbred lines of maize by particle bombardment. This technique utilizes immature maize
embryos of 1.5-2.5 mm length excised from a maize ear 14-15 days after pollination and a
PDS-1000He Biolistics device for bombardment.

Transformation of rice can also be undertaken by direct gene transfer techniques utilizing
protoplasts or particle bombardment. Protoplast-mediated transformation has been
described for Japonica-types and Indica-types (Zhang et al., Plant Cell Rep. 7: 379-384
(1988); Shimamoto et al. Nature 338: 274-277 (1989); Datta et al. Biotechnology 8: 736-740
(1990)). Both types are also routinely transformable using particle bombardment (Christou
et al. Biotechnology 9: 957-962 (1991)).

Patent Application EP 0 332 581 (to Ciba-Geigy) describes techniques for the generation,
transformation and regeneration of Pooideae protoplasts. These techniques allow the
transformation of Dactylis and wheat. Furthermore, wheat transformation has been
described by Vasil et al. (Biotechnology 10: 667-674 (1992)) using particle bombardment
into cells of type C long-term regenerable callus, and also by Vasil et al. (Biotechnology 11:
1553-1558 (1993)) and Weeks et al. (Plant Physiol. 102: 1077-1084 (1993)) using particle
bombardment of immature embryos and immature embryo-derived callus. A preferred
technique for wheat transformation, however, involves the transformation of wheat by
particle bombardment of immature embryos and includes either a high sucrose or a high
maltose step prior to gene delivery. Prior to bombardment, any number of embryos (0.75-1
mm in length) are plated onto MS medium with 3% sucrose (Murashige & Skoog,
Physiologia Plantarum 15: 473-497 (1962)) and 3 mg/l 2,4-D for induction of somatic
embryos, which is allowed to proceed in the dark. On the chosen day of bombardment,
embryos are removed from the induction medium and placed onto the osmoticum (i.e.
induction medium with sucrose or maltose added at the desired concentration, typically
15%). The embryos are allowed to plasmolyze for 2-3 h and are then bombarded. Twenty
embryos per target plate is typical, although not critical. An appropriate gene-carrying
plasmid (such as pCIB3064 or pSOG35) is precipitated onto micrometer size gold particles
using standard procedures. Each plate of embryos is shot with the DuPont Biolistics
helium device using a burst pressure of ~1000 psi using a standard 80 mesh screen. After
bombardment, the embryos are placed back into the dark to recover for about 24 h (still on
osmoticum). After 24 h, the embryos are removed from the osmoticum and placed back
onto induction medium where they stay for about a month before regeneration.
Approximately one month later the embryo explants with developing embryogenic callus are
transferred to regeneration medium (MS + 1 mg/liter NAA, 5 mg/liter GA), further containing
the appropriate selection agent (10 mg/l basta in the case of pCIB3064 and 2 mg/l
methotrexate in the case of pSOG35). After approximately one month, developed shoots
are transferred to larger sterile containers known as "GA7s" which contain half-strength
MS, 2% sucrose, and the same concentration of selection agent. Patent application WO
94/13822 describes methods for wheat transformation and is hereby incorporated by
reference.

Example 41: Expression of Pyrrolnitrin in Transgenic Plants

The GC content of all four pyrrolnitrin ORFs is between 62 and 68% and consequently no
AT-content related problems are anticipated with their expression in plants. It may,
however, be advantageous to modify the genes to include codons preferred in the
appropriate target plant species. Fusions of the kind described below can be made to any
desired promoter with or without modification (e.g. for optimized translational initiation in
plants or for enhanced expression).
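
GC content of a coding sequence is a simple base count. The following minimal sketch shows the calculation; the function name and the short test sequence are illustrative assumptions, and the toy sequence is not one of the pyrrolnitrin ORFs.

```python
def gc_content(seq):
    """Percentage of G and C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

# Toy 12-base sequence used only to exercise the function (8 of 12 bases are G or C).
TOY_SEQUENCE = "ATGGCCGCGTGA"

print(round(gc_content(TOY_SEQUENCE), 1))
```

Applied to each full ORF sequence, the same function would give the percentage figures of the kind quoted above.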

Expression behind the 35S Promoter 

Each of the four pyrrolnitrin ORFs is transferred to pBluescript KS II for further manipulation.
This is done by PCR amplification using primers homologous to each end of each gene and
which additionally include a restriction site to facilitate the transfer of the amplified
fragments to the pBluescript vector. For ORF1, the aminoterminal primer includes a SalI
site and the carboxyterminal primer a NotI site. Similarly for ORF2, the aminoterminal
primer includes a SalI site and the carboxyterminal primer a NotI site. For ORF3, the
aminoterminal primer includes a NotI site and the carboxyterminal primer an XhoI site.
Similarly for ORF4, the aminoterminal primer includes a NotI site and the carboxyterminal
primer an XhoI site. Thus, the amplified fragments are cleaved with the appropriate
restriction enzymes (chosen because they do not cleave within the ORF) and are then
ligated into pBluescript, also correspondingly cleaved. The cloning of the individual ORFs in
pBluescript facilitates their subsequent manipulation.
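
The choice of primer restriction sites that do not cleave within an ORF can be screened by scanning the ORF for each recognition sequence. The sketch below is a minimal illustration: the function name and the toy ORF are hypothetical, while the recognition sequences for SalI, NotI and XhoI are the standard ones. Because all three sites are palindromic, searching one strand is sufficient.

```python
# Standard recognition sequences of the enzymes named in the text.
RECOGNITION_SITES = {
    "SalI": "GTCGAC",
    "NotI": "GCGGCCGC",
    "XhoI": "CTCGAG",
}

def non_cutters(orf, sites=RECOGNITION_SITES):
    """Return the enzymes whose recognition sequence is absent from `orf`."""
    orf = orf.upper()
    return sorted(name for name, site in sites.items() if site not in orf)

# Toy sequence (not a pyrrolnitrin ORF) that contains one internal SalI site.
TOY_ORF = "ATGGCGTCGACCGGCTTCGAGCTGTAA"

print(non_cutters(TOY_ORF))
```

For the toy sequence, SalI would be excluded as a cloning site while NotI and XhoI remain usable.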

Destruction of internal restriction sites which are required for further construction is
undertaken using the procedure of "inside-outside PCR" (Innes et al. PCR Protocols: A
guide to methods and applications. Academic Press, New York (1990)). Unique restriction
sites are sought at either side of the site to be destroyed (ideally between 100 and 500 bp
from the site to be destroyed) and two separate amplifications are set up. One extends from
the unique site left of the site to be destroyed and amplifies DNA up to the site to be destroyed
with an amplifying oligonucleotide which spans this site and incorporates an appropriate
base change. The second amplification extends from the site to be destroyed up to the
unique site rightwards of the site to be destroyed. The oligonucleotide spanning the site to
be destroyed in this second reaction incorporates the same base change as in the first
amplification and ideally shares an overlap of between 10 and 25 nucleotides with the
oligonucleotide from the first reaction. Thus the products of both reactions share an overlap
which incorporates the same base change in the restriction site corresponding to that made
in each amplification. Following the two amplifications, the amplified products are gel
purified (to remove the four oligonucleotide primers used), mixed together and reamplified in
a PCR reaction using the two primers spanning the unique restriction sites. In this final
PCR reaction the overlap between the two amplified fragments provides the priming
necessary for the first round of synthesis. The product of this reaction extends from the
leftwards unique restriction site to the rightwards unique restriction site and includes the
modified restriction site located internally. This product can be cleaved with the unique sites
and inserted into the unmodified gene at the appropriate location by replacing the wild-type
fragment.
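
The three-reaction scheme can be followed at the sequence level with a small simulation. This is only a sketch under stated assumptions: the function name is hypothetical, the template is a toy top-strand sequence taken to run from the leftward to the rightward unique site, and primers, polymerase behaviour and the lower strand are not modelled.

```python
def inside_outside_product(template, site_start, old_site, new_site, overlap=20):
    """Simulate the fused product of the two overlapping amplifications.

    template   : top strand between the two unique flanking sites (assumed)
    site_start : index of the internal site to be destroyed
    overlap    : length of the region shared by the two first-round products
    """
    if template[site_start:site_start + len(old_site)] != old_site:
        raise ValueError("old_site not found at site_start")
    if len(new_site) != len(old_site):
        raise ValueError("new_site must have the same length as old_site")

    site_end = site_start + len(old_site)
    mutated = template[:site_start] + new_site + template[site_end:]

    flank = (overlap - len(new_site)) // 2
    lo, hi = site_start - flank, site_end + flank
    if lo < 0 or hi > len(mutated):
        raise ValueError("site too close to the end of the template")

    left_product = mutated[:hi]    # reaction 1: left unique site across the changed site
    right_product = mutated[lo:]   # reaction 2: changed site to the right unique site
    shared = mutated[lo:hi]        # overlap carrying the same base change in both products

    # Reaction 3: the shared overlap primes synthesis and joins the two products.
    assert left_product.endswith(shared) and right_product.startswith(shared)
    return left_product + right_product[len(shared):]

LEFT = "ATCGATTACG" * 3            # toy flanking sequence
RIGHT = "TTAGGCATCA" * 3           # toy flanking sequence
TEMPLATE = LEFT + "GCATGC" + RIGHT

product = inside_outside_product(TEMPLATE, len(LEFT), "GCATGC", "GCATGA")
print("GCATGC" in product, len(product) == len(TEMPLATE))
```

In this toy case the fused product has the same length as the template, carries the single base change, and no longer contains the original site.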

To render ORF1 free of the first of its two internal SphI sites, oligonucleotides spanning and
homologous to the unique XmaI and EspI sites are designed. The XmaI oligonucleotide is used
in a PCR reaction together with an oligonucleotide spanning the first SphI site and which
comprises the sequence ....CCCCCTCATGC.... (lower strand, SEQ ID NO:15), thus
introducing a base change into the SphI site. A second PCR reaction utilizes an
oligonucleotide spanning the SphI site (upper strand) comprising the sequence
....GCATGAGGGGG.... (SEQ ID NO:16) and is used in combination with the EspI
site-spanning oligonucleotide. The two products are gel purified and themselves amplified with
the XmaI and EspI-spanning oligonucleotides and the resultant fragment is cleaved with
XmaI and EspI and used to replace the native fragment in the ORF1 clone. According to
the above description, the modified SphI site is GCATGA and does not cause a codon
change. Other changes in this site are possible (i.e. changing the second nucleotide to a G,
T, or A) without corrupting amino acid integrity.
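
Candidate base changes that destroy a site without altering the encoded protein can be enumerated systematically. The sketch below uses the standard genetic code and a toy in-frame sequence; the function names and the toy ORF are assumptions for illustration and do not represent the pyrrolnitrin genes.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(orf):
    """Translate complete codons of an in-frame sequence."""
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))

def silent_site_changes(orf, site):
    """List single-base variants of the first `site` occurrence that keep the protein."""
    start = orf.find(site)
    if start < 0:
        return []
    protein = translate(orf)
    variants = []
    for offset in range(len(site)):
        for base in "ACGT":
            pos = start + offset
            if base == orf[pos]:
                continue
            mutated = orf[:pos] + base + orf[pos + 1:]
            if translate(mutated) == protein:
                variants.append(mutated[start:start + len(site)])
    return variants

# Toy ORF: ATG GCA TGC GGC TAA, with an SphI site (GCATGC) spanning codons 2 and 3.
TOY_ORF = "ATGGCATGCGGCTAA"

print(silent_site_changes(TOY_ORF, "GCATGC"))
```

Which positions tolerate a change depends entirely on the reading frame in which the site sits, so the list differs from gene to gene.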

A similar strategy is used to destroy the second SphI site in ORF1. In this case, EspI is a
suitable leftwards-located restriction site, and the rightwards-located restriction site is PstI,
located close to the 3' end of the gene, or alternatively SstI, which is not found in the ORF
sequence but is immediately adjacent in the pBluescript polylinker. In this case an
appropriate oligonucleotide is one which spans this site, or alternatively one of the available
pBluescript sequencing primers. This SphI site is modified to GAATGC or GCATGT or
GAATGT. Each of these changes destroys the site without causing a codon change.

To render ORF2 free of its single SphI site a similar procedure is used. Leftward restriction
sites are provided by PstI or MluI, and a suitable rightwards restriction site is provided by
SstI in the pBluescript polylinker. In this case the site is changed to GCTTGC, GCATGT or
GCTTGT; these changes maintain amino acid integrity.

ORF3 has no internal SphI sites.

In the case of ORF4, PstI provides a suitable rightwards unique site, but there is no suitable
site located leftwards of the single SphI site to be changed. In this case a restriction site in
the pBluescript polylinker can be used to the same effect as already described above. The
SphI site is modified to GGATGC, GTATGC, GAATGC, or GCATGT etc.

The removal of SphI sites from the pyrrolnitrin biosynthetic genes as described above
facilitates their transfer to the pCGN1761SENX vector by amplification using an
aminoterminal oligonucleotide primer which incorporates an SphI site at the ATG and a
carboxyterminal primer which incorporates a restriction site not found in the gene being
amplified. The resultant amplified fragment is cleaved with SphI and the restriction enzyme
cutting the carboxyterminal sequence and cloned into pCGN1761SENX. Suitable restriction
enzyme sites for incorporation into the carboxyterminal primer are NotI (for all four ORFs),
XhoI (for ORF3 and ORF4), and EcoRI (for ORF4). Given the requirement for the
nucleotide C at position 6 within the SphI recognition site, in some cases the second codon
of the ORF may require changing so as to start with the nucleotide C. This construction
fuses each ORF at its ATG to the SphI site of the translation-optimized vector
pCGN1761SENX in operable linkage to the double 35S promoter. After construction is
complete the final gene insertions and fusion points are resequenced to ensure that no
undesired base changes have occurred.

By utilizing an aminoterminal oligonucleotide primer which incorporates an NcoI site at its
ATG instead of an SphI site, ORFs 1-4 can also be easily cloned into the translation-optimized
vector pCGN1761NENX. None of the four pyrrolnitrin biosynthetic gene ORFs
carries an NcoI site and consequently there is no requirement in this case to destroy internal
restriction sites. Primers for the carboxyterminus of the gene are designed as described
above and the cloning is undertaken in a similar fashion. Given the requirement for the
nucleotide G at position 6 within the NcoI recognition site, in some cases the second codon
of the ORF may require changing so as to start with the nucleotide G. This construction
fuses each ORF at its ATG to the NcoI site of pCGN1761NENX in operable linkage to the
double 35S promoter.
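
The constraint on the second codon follows from the way the ATG is embedded in each
recognition site (SphI: GCATGC; NcoI: CCATGG), so that position 6 of the site is also the
first base of codon 2. A minimal Python sketch of that check is given below; the function
name and the example ORF are hypothetical and serve only to illustrate the rule.

```python
# Recognition sites whose positions 3-5 form the initiating ATG.
FUSION_SITES = {"SphI": "GCATGC", "NcoI": "CCATGG"}


def second_codon_for_fusion(orf, enzyme):
    """Return codon 2 of the ORF, altered if needed to fit the fusion site.

    The first base of codon 2 must equal position 6 of the recognition site
    (C for SphI, G for NcoI); the other two bases are left untouched.
    """
    if orf[:3] != "ATG":
        raise ValueError("ORF must begin with ATG")
    required = FUSION_SITES[enzyme][5]
    codon = orf[3:6]
    return codon if codon[0] == required else required + codon[1:]


# Hypothetical ORF beginning ATG AAC ... (Met-Asn):
print(second_codon_for_fusion("ATGAACGGT", "SphI"))  # CAC (histidine)
print(second_codon_for_fusion("ATGAACGGT", "NcoI"))  # GAC (aspartate)
```

Where the altered codon encodes an undesirable amino acid, the remaining two bases of
codon 2 can be changed as well; the sketch does not attempt that choice.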

The expression cassettes of the appropriate pCGN1761-derivative vectors are transferred
to transformation vectors. Where possible multiple expression cassettes are transferred to
a single transformation vector so as to reduce the number of plant transformations and
crosses between transformants which may be required to produce plants expressing all four
ORFs and thus producing pyrrolnitrin.

Expression behind 35S with Chloroplast Targeting

The pyrrolnitrin ORFs 1-4 amplified using oligonucleotides carrying an SphI site at their
aminoterminus are cloned into the 35S-chloroplast targeted vector pCGN1761/CT. The
fusions are made to the SphI site located at the cleavage site of the rbcS transit peptide.
The expression cassettes thus created are transferred to appropriate transformation vectors
(see above) and used to generate transgenic plants. As tryptophan, the precursor for
pyrrolnitrin biosynthesis, is synthesized in the chloroplast, it may be advantageous to
express the biosynthetic genes for pyrrolnitrin in the chloroplast to ensure a ready supply of
substrate. Transgenic plants expressing all four ORFs will target all four gene products to
the chloroplast and will thus synthesize pyrrolnitrin in the chloroplast.

Expression behind rbcS with Chloroplast Targeting

The pyrrolnitrin ORFs 1-4 amplified using oligonucleotides carrying an SphI site at their
aminoterminus are cloned into the rbcS-chloroplast targeted vector pCGN1761rbcS/CT.
The fusions are made to the SphI site located at the cleavage site of the rbcS transit
peptide. The expression cassettes thus created are transferred to appropriate
transformation vectors (see above) and used to generate transgenic plants. As tryptophan,
the precursor for pyrrolnitrin biosynthesis, is synthesized in the chloroplast, it may be
advantageous to express the biosynthetic genes for pyrrolnitrin in the chloroplast to ensure
a ready supply of substrate. Transgenic plants expressing all four ORFs will target all four
gene products to the chloroplast and will thus synthesize pyrrolnitrin in the chloroplast. The
expression of the four ORFs will, however, be light induced.

Example 42: Expression of Soraphen in Transgenic Plants

Clone p98/1 contains the entirety of the soraphen biosynthetic gene ORF1, which encodes
five biosynthetic modules for soraphen biosynthesis. The partially sequenced ORF2
contains the remaining three modules, and further required for soraphen biosynthesis is the
soraphen methylase located on the same operon.

Soraphen ORF1 is manipulated for expression in transgenic plants in the following manner.
A DNA fragment is amplified from the aminoterminus of ORF1 using PCR and p98/1 as
template. The 5' oligonucleotide primer includes either an SphI site or an NcoI site at the
ATG for cloning into the vectors pCGN1761SENX or pCGN1761NENX respectively. Further, the
5' oligonucleotide includes either the base C (for SphI cloning) or the base G (for NcoI
cloning) immediately after the ATG, and thus the second amino acid of the protein is
changed either to a histidine or an aspartate (other amino acids can be selected for position
2 by additionally changing other bases of the second codon). The 3' oligonucleotide for the
amplification is located at the first BglII site of the ORF and incorporates a distal EcoRI site,
enabling the amplified fragment to be cleaved with SphI (or NcoI) and EcoRI, and then
cloned into pCGN1761SENX (or pCGN1761NENX). To facilitate cleavage of the amplified
fragments, each oligonucleotide includes several additional bases at its 5' end. The
oligonucleotides preferably have 12-30 bp homology to the ORF1 template, in addition to
the required restriction sites and additional sequences. This manipulation fuses the
aminoterminal ~112 amino acids of ORF1 at its ATG to the SphI or NcoI sites of the
translation-optimized vectors pCGN1761SENX or pCGN1761NENX in linkage to the double
35S promoter. The remainder of ORF1 is carried on three BglII fragments which can be
sequentially cloned into the unique BglII site of the above-detailed constructions. The
introduction of the first of these fragments is no problem, and requires only the cleavage of
the aminoterminal construction with BglII followed by introduction of the first of these
fragments. For the introduction of the two remaining fragments, partial digestion of the
aminoterminal construction is required (since this construction now has an additional BglII
site) followed by introduction of the next BglII fragment. Thus, it is possible to construct a
vector containing the entire ~25 kb of soraphen ORF1 in operable fusion to the 35S
promoter.

An alternative approach to constructing the soraphen ORF1 by the fusion of sequential
restriction fragments is to amplify the entire ORF using PCR. Barnes (Proc. Natl. Acad. Sci.
USA 91: 2216-2220 (1994)) has recently described techniques for the high-fidelity
amplification of fragments by PCR of up to 35 kb, and these techniques can be applied to
ORF1. Oligonucleotides specific for each end of ORF1, with appropriate restriction sites
added, are used to amplify the entire coding region, which is then cloned into appropriate
sites in a suitable vector such as pCGN1761 or its derivatives. Typically after PCR
amplification, resequencing is advised to ensure that no base changes have arisen in the
amplified sequence. Alternatively, a functional assay can be done directly in transgenic
plants.

Yet another approach to the expression of the genes for polyketide biosynthesis (such as
soraphen) in transgenic plants is the construction, for expression in plants, of transcriptional
units which comprise less than the usual complement of modules, and to provide the
remaining modules on other transcriptional units. It is believed that the biosynthesis of
polyketide antibiotics such as soraphen is a process which requires the sequential activity of
specific modules and that for the synthesis of a specific molecule these activities should be
provided in a specific sequence. It is therefore likely that the expression of different transgenes
in a plant carrying different modules may lead to the biosynthesis of novel polyketide molecules,
because the sequential enzymatic nature of the wild-type genes is determined by their
configuration on a single molecule. It is assumed that the localization of five specific
modules for soraphen biosynthesis on ORF1 is determinatory in the biosynthesis of
soraphen, and that the expression of, say, three modules on one transgene and the other
two on another, together with ORF2, may result in biosynthesis of a polyketide with a
different molecular structure and possibly with a different antipathogenic activity. This
invention encompasses all such deviations of module expression which may result in the
synthesis in transgenic organisms of novel polyketides.

Although specific construction details are only provided for ORF1 above, similar techniques
are used to express ORF2 and the soraphen methylase in transgenic plants. For the
expression of functional soraphen in plants it is anticipated that all three genes must be
expressed and this is done as detailed in this specification.

Fusions of the kind described above can be made to any desired promoter with or without
modification (e.g. for optimized translational initiation in plants or for enhanced expression).
As the ORFs identified for soraphen biosynthesis are around 70% GC rich it is not
anticipated that the coding sequences should require modification to increase GC content
for optimal expression in plants. It may, however, be advantageous to modify the genes to
include codons preferred in the appropriate target plant species.
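
The GC content figure used in this kind of assessment is a simple base count. A minimal
Python sketch follows; the function name is hypothetical, and the handling of ambiguous
bases (ignored in both numerator and denominator) is an assumption rather than a
convention stated in this specification.

```python
def gc_content(seq):
    """Return the percentage of G and C among the unambiguous bases of seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


# Short hypothetical sequences, for illustration only.
print(gc_content("GCGCGCAT"))  # 75.0
print(gc_content("GGCCAATT"))  # 50.0
```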

Example 43: Expression of Phenazine in Transgenic Plants

The GC content of all the cloned genes encoding biosynthetic enzymes for phenazine
synthesis is between 58 and 65% and consequently no AT-content related problems are
anticipated with their expression in plants (although it may be advantageous to modify the
genes to include codons preferred in the appropriate target plant species). Fusions of the
kind described below can be made to any desired promoter with or without modification
(e.g. for optimized translational initiation in plants or for enhanced expression).

Expression behind the 35S Promoter

Each of the three phenazine ORFs is transferred to pBluescript II SK for further
manipulation. The phzB ORF is transferred as an EcoRI-BglII fragment cloned from
plasmid pLSP18-6H3del3 containing the entire phenazine operon. This fragment is
transferred to the EcoRI-BamHI sites of pBluescript II SK. The phzC ORF is transferred
from pLSP18-6H3del3 as an XhoI-ScaI fragment cloned into the XhoI-SmaI sites of
pBluescript II SK. The phzD ORF is transferred from pLSP18-6H3del3 as a BglII-HindIII
fragment into the BamHI-HindIII sites of pBluescript II SK.

Destruction of internal restriction sites which are required for further construction is
undertaken using the procedure of "inside-outside PCR" described above (Innes et al., PCR
Protocols: A guide to methods and applications, Academic Press, New York (1990)). In the
case of the phzB ORF two SphI sites are destroyed (one site located upstream of the ORF
is left intact). The first of these is destroyed using the unique restriction sites EcoRI (left of
the SphI site to be destroyed) and BclI (right of the SphI site). For this manipulation to be
successful, the DNA to be BclI cleaved for the final assembly of the inside-outside PCR
product must be produced in a dam-minus E. coli host such as SCS110 (Stratagene). For
the second phzB SphI site, the selected unique restriction sites are PstI and SpeI, the
latter being beyond the phzB ORF in the pBluescript polylinker. The phzC ORF has no
internal SphI sites, and so this procedure is not required for phzC. The phzD ORF,
however, has a single SphI site which can be removed using the unique restriction sites
XmaI and HindIII (the XmaI/SmaI site of the pBluescript polylinker is no longer present due
to the insertion of the ORF between the BamHI and HindIII sites).

The removal of SphI sites from the phenazine biosynthetic genes as described above
facilitates their transfer to the pCGN1761SENX vector by amplification using an
aminoterminal oligonucleotide primer which incorporates an SphI site at the ATG and a
carboxyterminal primer which incorporates a restriction site not found in the gene being
amplified. The resultant amplified fragment is cleaved with SphI and the restriction enzyme
cutting the carboxyterminal sequence and cloned into pCGN1761SENX. Suitable restriction
enzyme sites for incorporation into the carboxyterminal primer are EcoRI and NotI (for all
three ORFs; NotI will need checking when sequence complete), and XhoI (for phzB and
phzD). Given the requirement for the nucleotide C at position 6 within the SphI recognition
site, in some cases the second codon of the ORF may require changing so as to start with
the nucleotide C. This construction fuses each ORF at its ATG to the SphI site of the
translation-optimized vector pCGN1761SENX in operable linkage to the double 35S
promoter. After construction is complete the final gene insertions and fusion points are
resequenced to ensure that no undesired base changes have occurred.

By utilizing an aminoterminal oligonucleotide primer which incorporates an NcoI site at its
ATG instead of an SphI site, the three phz ORFs can also be easily cloned into the
translation-optimized vector pCGN1761NENX. None of the three phenazine biosynthetic
gene ORFs carries an NcoI site and consequently there is no requirement in this case to
destroy internal restriction sites. Primers for the carboxyterminus of the gene are designed
as described above and the cloning is undertaken in a similar fashion. Given the
requirement for the nucleotide G at position 6 within the NcoI recognition site, in some
cases the second codon of the ORF may require changing so as to start with the nucleotide
G. This construction fuses each ORF at its ATG to the NcoI site of pCGN1761NENX in
operable linkage to the double 35S promoter.

The expression cassettes of the appropriate pCGN1761-derivative vectors are transferred
to transformation vectors. Where possible multiple expression cassettes are transferred to
a single transformation vector so as to reduce the number of plant transformations and
crosses between transformants which may be required to produce plants expressing all three
ORFs and thus producing phenazine.

Expression behind 35S with Chloroplast Targeting

The three phenazine ORFs amplified using oligonucleotides carrying an SphI site at their
aminoterminus are cloned into the 35S-chloroplast targeted vector pCGN1761/CT. The
fusions are made to the SphI site located at the cleavage site of the rbcS transit peptide.
The expression cassettes thus created are transferred to appropriate transformation vectors
(see above) and used to generate transgenic plants. As chorismate, the likely precursor for
phenazine biosynthesis, is synthesized in the chloroplast, it may be advantageous to
express the biosynthetic genes for phenazine in the chloroplast to ensure a ready supply of
substrate. Transgenic plants expressing all three ORFs will target all three gene products to
the chloroplast and will thus synthesize phenazine in the chloroplast.

Expression behind rbcS with Chloroplast Targeting

The three phenazine ORFs amplified using oligonucleotides carrying an SphI site at their
aminoterminus are cloned into the rbcS-chloroplast targeted vector pCGN1761rbcS/CT.
The fusions are made to the SphI site located at the cleavage site of the rbcS transit
peptide. The expression cassettes thus created are transferred to appropriate
transformation vectors (see above) and used to generate transgenic plants. As chorismate,
the likely precursor for phenazine biosynthesis, is synthesized in the chloroplast, it may be
advantageous to express the biosynthetic genes for phenazine in the chloroplast to ensure
a ready supply of substrate. Transgenic plants expressing all three ORFs will target all three
gene products to the chloroplast and will thus synthesize phenazine in the chloroplast. The
expression of the three ORFs will, however, be light induced.

Example 44: Expression of the Non-Ribosomally Synthesized Peptide Antibiotic
Gramicidin in Transgenic Plants

The three Bacillus brevis gramicidin biosynthetic genes grsA, grsB and grsT have been
previously cloned and sequenced (Turgay et al. Mol. Microbiol. 6: 529-546 (1992);
Kraetzschmar et al. J. Bacteriol. 171: 5422-5429 (1989)). They are 3296, 13358, and 770
bp in length, respectively. These sequences are also published as GenBank accession
numbers X61658 and M29703. The manipulations described here can be undertaken using
the publicly available clones published by Turgay et al. (supra) and Kraetzschmar et al.
(supra), or alternatively from newly isolated clones from Bacillus brevis isolated as
described herein.

Each of the three ORFs grsA, grsB, and grsT is PCR amplified using oligonucleotides which
span the entire coding sequence. The leftward (upstream) oligonucleotide includes an SstI
site and the rightward (downstream) oligonucleotide includes an XhoI site. These restriction
sites are not found within any of the three coding sequences and enable the amplified
products to be cleaved with SstI and XhoI for insertion into the corresponding sites of
pBluescript II SK. This generates the clones pBLGRSa, pBLGRSb and pBLGRSt. The GC
content of these genes lies between 35 and 38%. Ideally, the coding sequences encoding
the three genes may be remade using the techniques referred to in Section K; however, it is
possible that the unmodified genes may be expressed at high levels in transgenic plants
without encountering problems due to their AT content. In any case it may be
advantageous to modify the genes to include codons preferred in the appropriate target
plant species.
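
Choosing cloning sites that are absent from a coding sequence amounts to scanning the
sequence for each candidate recognition site. The Python sketch below illustrates that
check with a hypothetical function name and a hypothetical test sequence (not a grs
sequence). The six recognition sequences listed are palindromic, so scanning one strand
is sufficient for them; that shortcut would not hold for non-palindromic sites.

```python
# Recognition sequences of enzymes named in this specification (all palindromic).
SITES = {
    "SstI": "GAGCTC",
    "XhoI": "CTCGAG",
    "SphI": "GCATGC",
    "NcoI": "CCATGG",
    "EcoRI": "GAATTC",
    "NotI": "GCGGCCGC",
}


def enzymes_not_cutting(seq, sites=SITES):
    """Return, sorted by name, the enzymes whose site is absent from seq."""
    seq = seq.upper()
    return sorted(name for name, site in sites.items() if site not in seq)


# Hypothetical sequence containing one SphI site and one EcoRI site.
print(enzymes_not_cutting("ATGGCATGCAAAGAATTCTAA"))
# -> ['NcoI', 'NotI', 'SstI', 'XhoI']
```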

The ORF grsA contains no SphI site and no NcoI site. This gene can thus be amplified
from pBLGRSa using an aminoterminal oligonucleotide which incorporates either an SphI
site or an NcoI site at the ATG, and a second carboxyterminal oligonucleotide which
incorporates an XhoI site, thus enabling the amplification product to be cloned directly into
pCGN1761SENX or pCGN1761NENX behind the double 35S promoter.

The ORF grsB contains no NcoI site and therefore this gene can be amplified using an
aminoterminal oligonucleotide containing an NcoI site in the same way as described above
for the grsA ORF; the amplified fragment is cleaved with NcoI and XhoI and ligated into
pCGN1761NENX. However, the grsB ORF contains three SphI sites and these are
destroyed to facilitate the subsequent cloning steps. The sites are destroyed using the
"inside-outside" PCR technique described above. Unique cloning sites found within the
grsB gene but not within pBluescript II SK are EcoNI, PflMI, and RsrII. Either EcoNI or
PflMI can be used together with RsrII to remove the first two sites and RsrII can be used
together with the ApaI site of the pBluescript polylinker to remove the third site. Once these
sites have been destroyed (without causing a change in amino acid), the entirety of the
grsB ORF can be amplified using an aminoterminal oligonucleotide including an SphI site at
the ATG and a carboxyterminal oligonucleotide incorporating an XhoI site. The resultant
fragment is cloned into pCGN1761SENX. In order to successfully PCR-amplify fragments
of such size, amplification protocols are modified in view of Barnes (Proc. Natl. Acad.
Sci. USA 91: 2216-2220 (1994)), who describes the high-fidelity amplification of large DNA
fragments. An alternative approach to the transfer of the grsB ORF to pCGN1761SENX
without necessitating the destruction of the three SphI restriction sites involves the transfer
to the SphI and XhoI cloning sites of pCGN1761SENX of an aminoterminal fragment of
grsB by amplification from the ATG of the gene using an aminoterminal oligonucleotide
which incorporates an SphI site at the ATG, and a second oligonucleotide which is adjacent
and 3' to the PflMI site in the ORF and which includes an XhoI site. Thus the
aminoterminal amplified fragment is cleaved with SphI and XhoI and cloned into
pCGN1761SENX. Subsequently the remaining portion of the grsB gene is excised from
pBLGRSb using PflMI and XhoI (which cuts in the pBluescript polylinker) and cloned into
the aminoterminal-carrying construction cleaved with PflMI and XhoI to reconstitute the
gene.

The ORF grsT contains no SphI site and no NcoI site. This gene can thus be amplified
from pBLGRSt using an aminoterminal oligonucleotide which incorporates either an SphI
site or an NcoI site at the initiating codon, which is changed to ATG (from GTG) for
expression in plants, and a second carboxyterminal oligonucleotide which incorporates an
XhoI site, thus enabling the amplification product to be cloned directly into pCGN1761SENX
or pCGN1761NENX behind the double 35S promoter.

Given the requirement for the nucleotide C at position 6 within the SphI recognition site, and
the requirement for the nucleotide G at position 6 within the NcoI recognition site, in some
cases the second codon of the ORF may require changing so as to start with the
appropriate nucleotide.

Transgenic plants are created which express all three gramicidin biosynthetic genes as
described elsewhere in the specification. Transgenic plants expressing all three genes
synthesize gramicidin.

Example 45: Expression of the Ribosomally Synthesized Peptide Lantibiotic
Epidermin in Transgenic Plants

The epiA ORF encodes the structural unit for epidermin biosynthesis and is approximately
420 bp in length (GenBank Accession No. X07840; Schnell et al. Nature 276-278
(1988)). This gene can be subcloned using PCR techniques from the plasmid pTu32 into
pBluescript II SK using oligonucleotides carrying the terminal restriction sites BamHI (5') and
PstI (3'). The epiA gene sequence has a GC content of 27% and this can be increased
using techniques of gene synthesis referred to elsewhere in this specification; this
sequence modification may not be essential, however, to ensure high-level expression in
plants. Subsequently the epiA ORF is transferred to the cloning vector pCGN1761SENX or
pCGN1761NENX by PCR amplification of the gene using an aminoterminal oligonucleotide
spanning the initiating methionine and carrying an SphI site (for cloning into
pCGN1761SENX) or an NcoI site (for cloning into pCGN1761NENX), together with a
carboxyterminal oligonucleotide carrying an EcoRI, a NotI, or an XhoI site for cloning into
either pCGN1761SENX or pCGN1761NENX. Given the requirement for the nucleotide C at
position 6 within the SphI recognition site, and the requirement for the nucleotide G at
position 6 within the NcoI recognition site, in some cases the second codon of the ORF may
require changing so as to start with the appropriate nucleotide.

Using cloning techniques described in this specification or well known in the art, the
remaining genes of the epi operon (viz. epiB, epiC, epiD, epiQ, and epiP) are subcloned
from plasmid pTu32 into pBluescript II SK. These genes are responsible for the
modification and polymerization of the epiA-encoded structural unit and are described in
Kupke et al. (J. Bacteriol. 174: 5354-5361 (1992)) and Schnell et al. (Eur. J. Biochem. 204:
57-68 (1992)). The subcloned ORFs are manipulated for transfer to pCGN1761-derivative
vectors as described above. The expression cassettes of the appropriate pCGN1761-derivative
vectors are transferred to transformation vectors. Where possible multiple
expression cassettes are transferred to a single transformation vector so as to reduce the
number of plant transformations and crosses between transformants which may be required
to produce plants expressing all required ORFs and thus producing epidermin.

L. Analysis of Transgenic Plants for APS Accumulation
Example 46: Analysis of APS Gene Expression 

Expression of APS genes in transgenic plants can be analyzed using standard Northern
blot techniques to assess the amount of APS mRNA accumulating in tissues. Alternatively,
the quantity of APS gene product can be assessed by Western analysis using antisera
raised to APS biosynthetic gene products. Antisera can be raised using conventional
techniques and proteins derived from the expression of APS genes in a host such as E.
coli. To avoid the raising of antisera to multiple gene products from E. coli expressing
multiple APS genes from multiple ORF operons, the APS biosynthetic genes can be
expressed individually in E. coli. Alternatively, antisera can be raised to synthetic peptides
designed to be homologous or identical to known APS biosynthetic predicted amino acid
sequence. These techniques are well known in the art.

Example 47: Analysis of APS Production in Transgenic Plants 

For each APS, known protocols are used to detect production of the APS in transgenic
plant tissue. These protocols are available in the appropriate APS literature. For
pyrrolnitrin, the procedure described in example 11 is used, and for soraphen the procedure
described in example 17. For phenazine determination, the procedure described in
example 18 can be used. For non-ribosomal peptide antibiotics such as gramicidin S, an
appropriate general technique is the assaying of ATP-PPi exchange. In the case of
gramicidin, the grsA gene can be assayed by phenylalanine-dependent ATP-PPi exchange
and the grsB gene can be assayed by proline-, valine-, ornithine-, or leucine-dependent
ATP-PPi exchange. Alternative techniques are described by Gause & Brazhnikova (Lancet 247:
715 (1944)). For ribosomally synthesized peptide antibiotics isolation can be achieved by
butanol extraction, dissolving in methanol and diethyl ether, followed by chromatography as
described by Allgaier et al. for epidermin (Eur. J. Biochem. 160: 9-22 (1986)). For many
APSs (e.g. pyrrolnitrin, gramicidin, phenazine) appropriate techniques are provided in the
Merck Index (Merck & Co., Rahway, NJ (1989)).

M. Assay of Disease Resistance in Transgenic Plants

Transgenic plants expressing APS biosynthetic genes are assayed for resistance to
phytopathogens using techniques well known in phytopathology. For foliar pathogens,
plants are grown in the greenhouse and at an appropriate stage of development inoculum
of a phytopathogen of interest is introduced in an appropriate manner. For soil-borne
phytopathogens, the pathogen is normally introduced into the soil before or at the time the
seeds are planted. The choice of plant cultivar selected for introduction of the genes will
have taken into account relative phytopathogen sensitivity. Thus, it is preferred that the
cultivar chosen will be susceptible to most phytopathogens of interest to allow a
determination of enhanced resistance.

Assay of Resistance to Foliar Phytopathogens 

Example 48: Disease Resistance to Tobacco Foliar Phytopathogens 

Transgenic tobacco plants expressing APS genes and shown to produce APS compound
are subjected to the following disease tests.

Phytophthora parasitica/Black shank: Assays for resistance to Phytophthora parasitica,
the causative organism of black shank, are performed on six-week-old plants grown as
described in Alexander et al., Proc. Natl. Acad. Sci. USA 90: 7327-7331. Plants are watered,
allowed to drain well, and then inoculated by applying 10 mL of a sporangium suspension
(300 sporangia/mL) to the soil. Inoculated plants are kept in a greenhouse maintained at
23-25°C day temperature and 20-22°C night temperature. The wilt index used for the
assay is as follows: 0 = no symptoms; 1 = some sign of wilting, with reduced turgidity; 2 =
clear wilting symptoms, but no rotting or stunting; 3 = clear wilting symptoms with stunting,
but no apparent stem rot; 4 = severe wilting, with visible stem rot and some damage to root
system; 5 = as for 4, but plants near death or dead, and with severe reduction of root
system. All assays are scored blind on plants arrayed in a random design.

Pseudomonas syringae: Pseudomonas syringae pv. tabaci (strain #551) is injected into
the two lower leaves of several 6-7 week old plants at a concentration of 10^6 or 3 x 10^6 per
ml in H2O. Six individual plants are evaluated at each time point. Pseudomonas tabaci
infected plants are rated on a 5 point disease severity scale, 5 = 100% dead tissue, 0 = no
symptoms. A T-test (LSD) is conducted on the evaluations for each day and the groupings
are indicated after the Mean disease rating value. Values followed by the same letter on
that day of evaluation are not statistically significantly different.

Cercospora nicotianae: A spore suspension of Cercospora nicotianae (ATCC #18366)
(100,000-150,000 spores per ml) is sprayed to imminent run-off on to the surface of the
leaves. The plants are maintained in 100% humidity for five days. Thereafter the plants are
misted with H2O 5-10 times per day. Six individual plants are evaluated at each time point.
Cercospora nicotianae is rated on a % leaf area showing disease symptoms basis. A T-test
(LSD) is conducted on the evaluations for each day and the groupings are indicated after
the Mean disease rating value. Values followed by the same letter on that day of evaluation
are not statistically significantly different.

Statistical Analyses: All tests include non-transgenic plants (six plants per assay, of the
same cultivar as the transgenic lines) (Alexander et al., Proc. Natl. Acad. Sci. USA 90: 7327-
7331). Pairwise T-tests are performed to compare different genotype and treatment groups
for each rating date.
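
A pairwise comparison of this kind can be sketched as a two-sample t statistic computed
for one rating date. The Python sketch below is an illustration only: it assumes an
equal-variance (pooled) t-test, which the text does not specify, and the function name
and the disease ratings are hypothetical. Converting the statistic to a significance
level requires a t-distribution table or a statistics package and is not shown.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for an equal-variance two-sample t-test."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    # Pooled estimate of the common variance from the two sample variances.
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical disease ratings for six plants per group on one rating date.
non_transgenic = [3, 4, 5, 4, 3, 5]
transgenic = [1, 2, 1, 2, 1, 2]
t, df = pooled_t_statistic(non_transgenic, transgenic)
print(round(t, 2), df)
```

With six plants per group the comparison has 10 degrees of freedom. Groups with
identical ratings throughout (zero pooled variance) are not handled by this sketch.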

Assay of Resistance to Soil-Borne Phytopathogens
Example 49: Resistance to Rhizoctonia solani 

Plant assays to determine resistance to Rhizoctonia solani are conducted by planting or
transplanting seeds or seedlings into naturally or artificially infested soil. To create
artificially infested soil, millet, rice, oat, or other similar seeds are first moistened with water,
then autoclaved and inoculated with plugs of the fungal phytopathogen taken from an agar
plate. When the seeds are fully overgrown with the phytopathogen, they are air-dried and
ground into a powder. The powder is mixed into soil at a rate experimentally determined to
cause disease. Disease may be assessed by comparing stand counts, root lesion ratings,
and shoot and root weights of transgenic and non-transgenic plants grown in the infested
soil. The disease ratings may also be compared to the ratings of plants grown under the
same conditions but without phytopathogen added to the soil.

Example 50: Resistance to Pseudomonas solanacearum 

Plant assays to determine resistance to Pseudomonas solanacearum are conducted by 
planting or transplanting seeds or seedlings into naturally or artificially infested soil. To 
create artificially infested soil, bacteria are grown in shake flask cultures, then mixed into the 
soil at a rate experimentally determined to cause disease. The roots of the plants may
need to be slightly wounded to ensure disease development. Disease may be assessed by
comparing stand counts, degree of wilting and shoot and root weights of transgenic and 
non-transgenic plants grown in the infested soil. The disease ratings may also be 
compared to the ratings of plants grown under the same conditions but without 
phytopathogen added to the soil. 

Example 51: Resistance to Soil-Borne Fungi which are Vectors for Virus
Transmission

Many soil-borne Polymyxa, Olpidium and Spongospora species are vectors for the
transmission of viruses. These include (1) Polymyxa betae which transmits Beet Necrotic
Yellow Vein Virus (the causative agent of rhizomania disease) to sugar beet, (2) Polymyxa
graminis which transmits Wheat Soil-Borne Mosaic Virus to wheat, and Barley Yellow
Mosaic Virus and Barley Mild Mosaic Virus to barley, (3) Olpidium brassicae which transmits
Tobacco Necrosis Virus to tobacco, and (4) Spongospora subterranea which transmits
Potato Mop Top Virus to potato. Seeds or plants expressing APSs in their roots (e.g.
constitutively or under root specific expression) are sown or transplanted in sterile soil and
fungal inocula carrying the virus of interest are introduced to the soil. After a suitable time
period the transgenic plants are assayed for viral symptoms and accumulation of virus by
ELISA and Northern blot. Control experiments involve no inoculation, and inoculation with
fungus which does not carry the virus under investigation. The transgenic plant lines under
analysis should ideally be susceptible to the virus in order to test the efficacy of the APS-based
protection. In the case of viruses such as Barley Mild Mosaic Virus which are both
Polymyxa-transmitted and mechanically transmissible, a further control is provided by the
successful mechanical introduction of the virus into plants which are protected against soil
infection by APS expression in roots.

Resistance to virus-transmitting fungi conferred by expression of APSs will thus prevent virus
infections of target crops, improving plant health and yield.

Example 52: Resistance to Nematodes 

Transgenic plants expressing APSs are analyzed for resistance to nematodes. Seeds or
plants expressing APSs in their roots (e.g. constitutively or under root specific expression)
are sown or transplanted in sterile soil and nematode inocula are introduced to the
soil. Nematode damage is assessed at an appropriate time point. Root knot nematodes
such as Meloidogyne spp. are introduced to transgenic tobacco or tomato expressing APSs.
Cyst nematodes such as Heterodera spp. are introduced to transgenic cereals, potato and
sugar beet. Lesion nematodes such as Pratylenchus spp. are introduced to transgenic
soybean, alfalfa or corn. Reniform nematodes such as Rotylenchulus spp. are introduced
to transgenic soybean, cotton, or tomato. Ditylenchus spp. are introduced to transgenic
alfalfa. Detailed techniques for screening for resistance to nematodes are provided in Starr
(Ed.; Methods for Evaluating Plant Species for Resistance to Plant Parasitic Nematodes,
Society of Nematologists, Hyattsville, Maryland (1990)).

Examples of Important Phytopathogens in Agricultural Crop Species
Example 53: Disease Resistance in Maize 

Transgenic maize plants expressing APS genes and shown to produce APS compound are
subjected to the following disease tests. Tests for each phytopathogen are conducted
according to standard phytopathological procedures.

Leaf Diseases and Stalk Rots 

(1) Northern Corn Leaf Blight (Helminthosporium turcicum† syn. Exserohilum turcicum).

(2) Anthracnose (Colletotrichum graminicola†-same as for Stalk Rot)

(3) Southern Corn Leaf Blight (Helminthosporium maydis† syn. Bipolaris maydis).

(4) Eye Spot (Kabatiella zeae)

(5) Common Rust (Puccinia sorghi†).

(6) Southern Rust (Puccinia polysora).

(7) Gray Leaf Spot (Cercospora zeae-maydis† and C. sorghi)

(8) Stalk Rots (a complex of two or more of the following pathogens-Pythium
aphanidermatum†-early, Erwinia chrysanthemi-zeae-early, Colletotrichum
graminicola†, Diplodia maydis†, D. macrospora, Gibberella zeae†, Fusarium
moniliforme†, Macrophomina phaseolina, Cephalosporium acremonium)

(9) Goss' Disease (Clavibacter nebraskanense)

Important Ear Molds

(1) Gibberella Ear Rot (Gibberella zeae†-same as for Stalk Rot)
Aspergillus flavus, A. parasiticus. Aflatoxin

(2) Diplodia Ear Rot (Diplodia maydis† and D. macrospora-same organisms as for Stalk Rot)

(3) Head Smut (Sphacelotheca reiliana-syn. Ustilago reiliana)

Example 54: Disease Resistance in Wheat 

Transgenic wheat plants expressing APS genes and shown to produce APS compound are
subjected to the following disease tests. Tests for each pathogen are conducted according
to standard phytopathological procedures.

(1) Septoria Diseases (Septoria tritici, S. nodorum)

(2) Powdery Mildew (Erysiphe graminis)

(3) Yellow Rust (Puccinia striiformis)

(4) Brown Rust (Puccinia recondita, P. hordei)

(5) Others-Brown Foot Rot/Seedling Blight (Fusarium culmorum and Fusarium roseum),
Eyespot (Pseudocercosporella herpotrichoides), Take-All (Gaeumannomyces
graminis)

(6) Viruses (barley yellow mosaic virus, barley yellow dwarf virus, wheat yellow mosaic virus).

N. Assay of Biocontrol Efficacy in Microbial Strains Expressing APS Genes
Example 55: Protection of Cotton against Rhizoctonia solani

Assays to determine protection of cotton from infection caused by Rhizoctonia solani are
conducted by planting seeds treated with the biocontrol strain in naturally or artificially
infested soil. To create artificially infested soil, millet, rice, oat, or other similar seeds are
first moistened with water, then autoclaved and inoculated with plugs of the fungal
pathogen taken from an agar plate. When the seeds are fully overgrown with the pathogen,
they are air-dried and ground into a powder. The powder is mixed into soil at a rate
experimentally determined to cause disease. This infested soil is put into pots, and seeds
are placed in furrows 1.5 cm deep. The biocontrol strains are grown in shake flasks in the
laboratory. The cells are harvested by centrifugation, resuspended in water, and then
drenched over the seeds. Control plants are drenched with water only. Disease may be
assessed 14 days later by comparing stand counts and root lesion ratings of treated and
nontreated seedlings. The disease ratings may also be compared to the ratings of
seedlings grown under the same conditions but without pathogen added to the soil.

Example 56: Protection of Potato against Clavibacter michiganense subsp. sepedonicum

Clavibacter michiganense subsp. sepedonicum is the causal agent of potato ring rot disease
and is typically spread before planting when "seed" potato tubers are knife cut to generate
more planting material. Transmission of the pathogen on the surface of the knife results in
the inoculation of entire "seed" batches. Assays to determine protection of potato from the
causal agent of ring rot disease are conducted by inoculating potato seed pieces with both
the pathogen and the biocontrol strain. The pathogen is introduced by first cutting a
naturally infected tuber, then using the knife to cut other tubers into seed pieces. Next, the
seed pieces are treated with a suspension of biocontrol bacteria or water as a control.
Disease is assessed at the end of the growing season by evaluating plant vigor, yield, and
number of tubers infected with Clavibacter.



O. Isolation of APSs from Organisms Expressing the Cloned Genes
Example 57: Extraction Procedures for APS Isolation 

Active APSs can be isolated from the cells or growth medium of wild-type or transformed
strains that produce the APS. This can be undertaken using known protocols for the
isolation of molecules of known characteristics.
For example, for APSs which contain multiple benzene rings (pyrrolnitrin and soraphen)
cultures are grown for 24 h in 10 ml L broth at an appropriate temperature and then
extracted with an equal volume of ethyl acetate. The organic phase is recovered, allowed
to evaporate under vacuum and the residue dissolved in 20 µl of methanol.

In the case of pyrrolnitrin a further procedure has been used successfully for the extraction
of the active antipathogenic compound from the growth medium of the transformed strain
producing this antibiotic. This is accomplished by extraction of the medium with 80%
acetone followed by removal of the acetone by evaporation and a second extraction with
diethyl ether. The diethyl ether is removed by evaporation and the dried extract is
resuspended in a small volume of water. Small aliquots of the antibiotic extract applied to
small sterile filter paper discs placed on an agar plate will inhibit the growth of Rhizoctonia
solani, indicating the presence of the active antibiotic compound.

A preferred method for phenazine isolation is described by Thomashow et al. (Appl Environ
Microbiol 56: 908-912 (1990)). This involves acidifying cultures to pH 2.0 with HCl and
extraction with benzene. Benzene fractions are dehydrated with Na2SO4 and evaporated to
dryness. The residue is redissolved in aqueous 5% NaHCO3, reextracted with an equal
volume of benzene, acidified, partitioned into benzene and redried.

For peptide antibiotics (which are typically hydrophobic) extraction techniques using
butanol, methanol, chloroform or hexane are suitable. In the case of gramicidin, isolation
can be carried out according to the procedure described by Gause & Brazhnikova (Lancet
247: 715 (1944)). For epidermin, the procedure described by Allgaier et al.
(Eur. J. Biochem. 160: 9-22 (1986)) is suitable and involves butanol extraction, and
dissolving in methanol and diethyl ether. For many APSs (e.g. pyrrolnitrin, gramicidin,
phenazine) appropriate techniques are provided in the Merck Index (Merck & Co., Rahway,
NJ (1989)).

P. Formulation and Use of Isolated Antibiotics

Antifungal formulations can be made using active ingredients which comprise either the
isolated APSs or alternatively suspensions or concentrates of cells which produce them.
Formulations can be made in liquid or solid form.
Example 58: Liquid Formulation of Antifungal Compositions

In the following examples, percentages of composition are given by weight:

1. Emulsifiable concentrates:              a      b      c

Active ingredient                         20%    40%    50%
Calcium dodecylbenzenesulfonate            5%     8%     6%
Castor oil polyethylene glycol             5%     -      -
ether (36 moles of ethylene oxide)
Tributylphenol polyethylene glycol         -     12%     4%
ether (30 moles of ethylene oxide)
Cyclohexanone                              -     15%    20%
Xylene mixture                            70%    25%    20%

Emulsions of any required concentration can be produced from such concentrates by
dilution with water.
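
The dilution step can be illustrated with a simple mass balance. The sketch below is not part of the formulation examples; it assumes percentages by weight throughout, and the function name `dilution_amounts` and the target concentration used are hypothetical.

```python
def dilution_amounts(stock_pct, target_pct, final_mass):
    """Return (concentrate_mass, water_mass) needed to obtain final_mass of an
    emulsion containing target_pct active ingredient by weight, starting from
    a concentrate containing stock_pct active ingredient by weight."""
    if not 0 < target_pct <= stock_pct <= 100:
        raise ValueError("need 0 < target_pct <= stock_pct <= 100")
    concentrate = final_mass * target_pct / stock_pct
    return concentrate, final_mass - concentrate


# Concentrate (a) contains 20% active ingredient. A hypothetical spray
# emulsion with 0.05% active ingredient is prepared in a 1000 g batch.
concentrate_g, water_g = dilution_amounts(20, 0.05, 1000)
print(f"{concentrate_g:.1f} g concentrate + {water_g:.1f} g water")
```

The same calculation applies to the other concentrates by substituting their active ingredient content for `stock_pct`.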










2. Solutions:                              a      b      c      d

Active ingredient                         80%    10%     5%    95%
Ethylene glycol monomethyl ether          20%     -      -      -
Polyethylene glycol 400                    -     70%     -      -
N-methyl-2-pyrrolidone                     -     20%     -      -
Epoxidised coconut oil                     -      -      1%     5%
Petroleum distillate                       -      -     94%     -
(boiling range 160-190°)

These solutions are suitable for application in the form of microdrops.






3. Granulates:                             a      b

Active ingredient                          5%    10%
Kaolin                                    94%     -
Highly dispersed silicic acid              1%     -
Attapulgite                                -     90%

The active ingredient is dissolved in methylene chloride, the solution is sprayed onto the
carrier, and the solvent is subsequently evaporated off in vacuo.
4. Dusts:                                  a      b

Active ingredient                          2%     5%
Highly dispersed silicic acid              1%     5%
Talcum                                    97%     -
Kaolin                                     -     90%

Ready-to-use dusts are obtained by intimately mixing the carriers with the active ingredient.
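
Because every composition is stated in percent by weight, each column of a formulation table should total 100%. The following sketch checks this for the two dust compositions and scales a recipe to a batch. It is illustrative only; the helper names and the 50 kg batch size are assumptions, not part of the examples.

```python
# Dust compositions (a) and (b), in percent by weight.
dusts = {
    "a": {"Active ingredient": 2, "Highly dispersed silicic acid": 1, "Talcum": 97},
    "b": {"Active ingredient": 5, "Highly dispersed silicic acid": 5, "Kaolin": 90},
}


def total_percent(recipe):
    """Sum of all component percentages; 100 for a complete recipe."""
    return sum(recipe.values())


def batch_masses(recipe, batch_kg):
    """Mass of each component (kg) needed for a batch of batch_kg."""
    return {name: batch_kg * pct / 100 for name, pct in recipe.items()}


for label, recipe in dusts.items():
    assert total_percent(recipe) == 100, f"dust {label} does not total 100%"

# Hypothetical 50 kg batch of dust (a).
for name, kg in batch_masses(dusts["a"], 50).items():
    print(f"{name}: {kg:.1f} kg")
```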

Example 59: Solid Formulation of Antifungal Compositions

In the following examples, percentages of compositions are by weight.

1. Wettable powders:                       a      b      c

Active ingredient                         20%    60%    75%
Sodium lignosulfonate                      5%     5%     -
Sodium lauryl sulfate                      3%     -      5%
Sodium diisobutylnaphthalene sulfonate     -      6%    10%
Octylphenol polyethylene glycol ether      -      2%     -
(7-8 moles of ethylene oxide)
Highly dispersed silicic acid              5%    27%    10%
Kaolin                                    67%     -      -

The active ingredient is thoroughly mixed with the adjuvants and the mixture is thoroughly
ground in a suitable mill, affording wettable powders which can be diluted with water to give
suspensions of the desired concentrations.



2. Emulsifiable concentrate:

Active ingredient                         10%
Octylphenol polyethylene glycol ether      3%
(4-5 moles of ethylene oxide)
Calcium dodecylbenzenesulfonate            3%
Castor oil polyglycol ether                4%
(36 moles of ethylene oxide)
Cyclohexanone                             30%
Xylene mixture                            50%

Emulsions of any required concentration can be obtained from this concentrate by dilution
with water.

3. Dusts:                                  a      b

Active ingredient                          5%     8%
Talcum                                    95%     -
Kaolin                                     -     92%

Ready-to-use dusts are obtained by mixing the active ingredient with the carriers, and
grinding the mixture in a suitable mill.

4. Extruder granulate:

Active ingredient                         10%
Sodium lignosulfonate                      2%
Carboxymethylcellulose                     1%
Kaolin                                    87%

The active ingredient is mixed and ground with the adjuvants, and the mixture is
subsequently moistened with water. The mixture is extruded and then dried in a stream of
air.

5. Coated granulate:

Active ingredient                          3%
Polyethylene glycol 200                    3%
Kaolin                                    94%

The finely ground active ingredient is uniformly applied, in a mixer, to the kaolin moistened
with polyethylene glycol. Non-dusty coated granulates are obtained in this manner.

6. Suspension concentrate:

Active ingredient                         40%
Ethylene glycol                           10%
Nonylphenol polyethylene glycol            6%
(15 moles of ethylene oxide)
Sodium lignosulfonate                     10%
Carboxymethylcellulose                     1%
37% aqueous formaldehyde solution        0.2%
Silicone oil in 75% aqueous emulsion     0.8%
Water                                     32%

The finely ground active ingredient is intimately mixed with the adjuvants, giving a
suspension concentrate from which suspensions of any desired concentration can be
obtained by dilution with water.

While the present invention has been described with reference to specific embodiments 
thereof, it will be appreciated that numerous variations, modifications, and embodiments are 
possible, and accordingly, all such variations, modifications and embodiments are to be 
regarded as being within the spirit and scope of the present invention. 
SEQUENCE LISTING

(1) GENERAL INFORMATION:

     (i) APPLICANT:
          (A) NAME: CIBA-GEIGY AG
          (B) STREET: Klybeckstr. 141
          (C) CITY: Basel
          (E) COUNTRY: Switzerland
          (F) POSTAL CODE (ZIP): 4002
          (G) TELEPHONE: +41 61 69 11 11
          (H) TELEFAX: + 41 61 696 79 76
          (I) TELEX: 962 991

    (ii) TITLE OF INVENTION: Genes for the synthesis of
          antipathogenic substances

   (iii) NUMBER OF SEQUENCES: 22

    (iv) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 7000 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
          (B) STRAIN: single

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 357..2039
          (D) OTHER INFORMATION: /label= ORF1

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 2249..3076
          (D) OTHER INFORMATION: /label= ORF2

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 3166..4869
          (D) OTHER INFORMATION: /label= ORF3

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 4894..5985
          (D) OTHER INFORMATION: /label= ORF4
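
The CDS locations above determine the length of each open reading frame. The sketch below derives the nucleotide length and the number of encoded residues for each feature, assuming (as the listing suggests, since the ORF1 translation is numbered to residue 560) that each location includes the terminal stop codon. The function name `cds_summary` is hypothetical.

```python
# CDS locations from the feature table of SEQ ID NO: 1 (1-based, inclusive).
cds_locations = {
    "ORF1": (357, 2039),
    "ORF2": (2249, 3076),
    "ORF3": (3166, 4869),
    "ORF4": (4894, 5985),
}


def cds_summary(start, end):
    """Return (nucleotides, residues) for an inclusive CDS location.

    Assumes the location spans whole codons and that the last codon is a
    stop codon, which is therefore not counted as a residue.
    """
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return nucleotides, nucleotides // 3 - 1


for label, (start, end) in cds_locations.items():
    nt, aa = cds_summary(start, end)
    print(f"{label}: {nt} nt, {aa} amino acids")
```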



    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GAATTCOGAC AAOGCCGAAG AAGOGOGQ^ OCGCTQVAAG AGGAGCAGGA ACTGGAGCAA 60

ACGCTGTCCC AGGTGATCGA CAGCCTGCCA CTGCGCATCG AGGGCCGATG AACAGCATTG 120

GCAAAAGCTG GCGGTGOGCA GIGCGCGAGT GATCCGATCA TTTTIGATOG GCTCGCCIX^ 180

TCAAAATCGG CGGIGGATGA AGTCGACGGC GGACIGATCA GGOGCAAAAG AACATGOGOC 240

AAAAOCTTCT TTTATAQOa ATACCPTTQC ACTTCAGAAT GTTAATTCGG AAAOGGAATT 300

TGCATCGCTT TTCOGGCAGT CTAGAGICTC TAACAGCACA TTGRTGTGOC TCTTGC 356

ATG GAT GCA CGA AGA CTG GCG GCC TCC CCT CGT CAC AGG CGG COC GCC 404
Met Asp Ala Arg Arg Leu Ala Ala Ser Pro Arg His Arg Arg Pro Ala
1 5 10 15



TTTGACACAAGGAGTGTTATGAACAAGO^ATCAAGAATATCGTCATC 452 
Phe Asp Thr Arg Ser Val Met Asn Lys Pro lie Lys Asn He Val He 
20 25 30 

GTG GGC GGCGGTACTGCGGGCTGGATGGOCGCCTOGTACCTCGTC COG 500 
Val Gly Gly Gly Thr Ala Gly Trp Met Ala Ala Ser Tyr Leu Val Arg 
35 40 45 



GCCCTCCAACAGCAGGCGAACATTAOSCTCATCGAATCTGCGGOGATC 548 
Ala Leu Gin Gin Gin Ala Asn He Thr Leu He Glu Ser Ala Ala He 
50 55 60 ~ 

CCTCGGATCGGCGTGGGCGAAGOSACCATCCCAAGTTTGCAGAAGG^ 596 
Pro Arg He Gly Val Gly Glu Ala Thr He Pro Ser Leu Gin Lys Val 
65 70 75 80 

TTCTTCGATTTCCTCGGGA!mCCGGAGCGGGAATGGATGO0C^ 644 
Phe Phe Asp Phe Leu Gly He Pro Glu Arg Glu Tip Met Pro Gin Val 
85 90 95 

AACGGCGCGTTCAAGGO^GOGATCAAGTTCGTGAATTGGAGAAAGT^ 692 
Asn Gly Ala Phe Lys Ala Ala He Lys Phe Val Asn Trp Arg Lys Ser 
100 105 HO 

OCC GAG CCC TCG OGC GAC GAT CAC TTC TAC CAT TTG TTC GGC AAC GTG 740 
Pro Asp Pro Ser Arg Asp Asp His Phe Tyr His Leu Phe Gly Asn Val 
115 120 125

CCGi^TGCGACGGCGTGCOGCTTACCCACTACTGGCTO 788 
Pro Asn Cys Asp Gly Val. Pro Leu Thr His Tyr Trp Leu Arg Lys Arg 
130 135 140 

GAA CAG GGC TTC CAG CAG CCG ATO GAGTACGCGTGCTACCXXSCAGOCX: 836 
Glu Gin Gly Phe Gin Gin Pro Met Glu Tyr Ala Cys Tyr Pro Gin Pro 
145 150 155 160 

GGGGCACTCGACGGCAAGCTGGCAC0GTCCCTGTCX:G^ 884 
Gly Ala Leu Asp Gly Lys Leu Ala Pro Cys Leu Ser Asp Gly Thr T^rg 
165 170 175 

CAG ATG TCCCACQOGTGGCACTTCGACGOSCACCTGGTGGCCGAC^ 932 
Gin Ser His Ala Trp His Phe Asp Ala Bis Leu Val Ala Asp Phe 
180 185 190 

TTGAAGOGCTGGGCCGTCGAGa^CGGGGIGAACCGCGIGGICGATGA^ 980 
Leu Lys Arg Trp Ala Val Glu Arg Gly Val Asn Arg Val Val Asp Glu 
195 200 205 

GTG GIG GAC GTT CGC CIG. AACAACOGCGGCTACATCTCCAACCIGCTC 1028 
Val Val Asp Val Arg Leu Asn Asn Arg Gly 'Tyr lie Ser Asn Leu Leu 
210 215 220 

ACC AAG GAG GGG CGG ACG CIGGAGGOGGACCTGTTCATCGACTGCTCC 1076 
Thr Lys Glu Gly Arg Thr Leu Glu Ala Asp Leu Phe lie Asp Cys Ser 
225 230 235 240 

GGC ATG CGG GGG CIC CIG ATC AAT CAG GCG CTG AAG GAA CCC TTC ATC 1124 
Gly Met Arg Gly Leu Leu lie Asn Gin Ala Leu Lys Glu Pro Phe He 
245 250 255 

GACATGTCCGACTACCIGCIGTGCGACAGCGOGGICGCCAGCGCCG^ 1172 
Asp Met Ser Asp Tyr Leu Leu Cys Asp Ser Ala Val Ala Ser Ala Val 
260 265 270 

CCCAACGACGACGaSOGCGATGGGGTCGAGOaSTACACCTCCTOSA^ 1220 
Pro Asn Asp Asp Ala Arg Asp Gly Val Glu Pro 'Syr Thr Ser Ser He 
275 280 285 

GCCATGAACTCGGGATGGACCTGGAAGATTCCGATGCTGGGCOGGTTC 1268 
Ala Met Asn Ser Gly Trp Thr Trp Lys lie Pro Met Leu Gly Arg Phe 
290 295 300 

GGCAGCGGCTACGTCTTCTOGAGCCATTTCACCTCGOGCGACCAGGOC 1316 
Gly Ser Gly Tyr Val Phe Ser Ser His Phe Thr Ser Arg Asp Gin Ala 
305 310 315 320 

ACCGCCGACTTCCTCAAACTCTGGGGCCICTOSGACAATCAGCCGCrc 1364 
Thr Ala Asp Phe Leu Lys Leu Tzp Gly Leu Ser Asp Asn Gin Pro Leu 
325 330 335 

AACCAGATCAAGTTCCGGGTCGGGCGCAACAAGCGGGCGTGGGTCAAC 1412 
Asn Gln Ile Lys Phe Arg Val Gly Arg Asn Lys Arg Ala Trp Val Asn
340 345 350 

AAC TGC GTC TCG ATC GGG CTG TOG TOG TGC TTT CTG GAG CCC CTG GAA 1460 
Asn Cys Val Ser lie Gly Leu Ser Ser Cys Phe Leu Glu Pro Leu Glu 
355 360 365 

TCGAOGGGGATCTACTTOATCTACGOGGOSCTTTACCA^ 1508 
Ser Thr Gly lie Tyr Phe lie Tyr Ala Ala Leu Tyr Gin Leu Val Lys 
370 375 380 

CAC TTC CCC GAC ACC TCG TTC GAC COG CGG CTG AGO GAC GCT TTC AAC 1556 
His Phe Pro Asp Thr Ser Phe Asp Pro Arg Leu Ser Asp Ala Phe Asn 
385 390 395 400 

GCCGAGATCGTCCACATGTTCGACGACTGCaSGGATTTCGTC^ 1604 
Ala Glu lie Val His Met Phe Asp Asp Cys Arg Asp Phe Val Gin Ala 
405 410 415 

CAC TAT TTC ACC ACG TCG CGC GAT GAC ACG COG TTC TGG CTC GOG AAC 1652 
His Tyr Phe Thr Thr Ser Arg Asp Asp Thr Pro Phe Trp Leu Ala Asn 
420 425 430 

CGGCACGACCTGa3GC:3X:TCGGACGOCATCAAAGAGA^ 1700 
Arg His Asp Leu Arg Leu Ser Asp Ala He Lys Glu Lys Val Gin Arg 
435 440 445 

TAC AAG GOG GGG CTG COG CIG ACC AOC AOG TOG TTC GAC GKT TOO AOG 1748 
Tyr Lys Ala Gly Leu Pro Leu Thr Thr Thr Ser Phe Asp Asp Ser Thr 
450 455 460 

TAC TAC GAG ACC TTC GAC TAC GAA TTCAAGAATTTCTGGTTGAACGGC 1796 
Tyr Tyr Glu Thr Phe Asp Tyr Glu Phe Lys Asn Phe Trp Leu Asn Gly 
465 470 475 480 

AAC TAC TACTGCATCTTTGOCGGCTTGGGCATG GIG 000 GAC CGG TOG 1844 
Asn Tyr Tyr Cys lie Phe Ala Gly Leu Gly Met Leu Pro Asp Arg Ser 
485 490 495 

CTG CCG CIG TTG CAG CAC CGA COG GAG TCG ATC GAG AAA GOO GAG GOG 1892 
Leu Pro Leu Leu Gin His Arg Pro Glu Ser He Glu Lys Ala Glu Ala 
500 505 510 

ATGTTCGCCAGCATCOGGCGCGAGGCCGAGOGTCTGOGCADCAQCCTG 1940 
Met Phe Ala Ser He Arg Arg Glu Ala Glu Arg Leu Arg tOir Ser Leu 
515 520 525 

CCG ACA AAC TAC GAC TAC CTG CGG TOG CTG OGT GAC GGC GAC GOG GGG 1988 
Pro Thr Asn Tyr Asp Tyr Leu Arg Ser Leu Arg Asp Gly Asp Ala Gly 
530 535 540 

CTGTCGCGCQGCCAGCGTQGGOOGAAGCTCGCAGCGCAGGAA AGO CTG 2036 
Leu Ser Arg Gly Gin Arg Gly Pro Lys Leu Ala Ala Gin Glu Ser Leu 
545 550 555 560 
TAGTGG?^AOG CACCTTGGAC CGGGIAGGCG TATTOGOGGC CACXXSmH! GCCGTGGOGG 2096

OCIGOGATOC GCIGCAGGCX3 CGOGOGCICG TICIGCAACT GCOGGGCTTG AACOGIAACTl 2156 

AGGAOGIGOC CX^STATOGTC GGCXTTGCTGC (XXSMSTTCCT T00GGIGC3GC GGCXn!GOOCT 2216 

GCX3GCIGGG6 TTTOGTCGAA GCOQCCGCCG 06M!GO0GGACATCGGGTTCTTC 2269 

Met Azg Asp lie Gly Phe Phe 
1 5 

CTOGGGTCGCTCAAGCGCO^QGACATGJtfSC^ 2317 
Leu Gly Sex Leu Lys Azg His Gly His Glu Pro Ala Glu Val Val Pro 
10 15 20 

QGG CTT GAG COG GTG CTG CTC GAC CTG GCA OQC GCG ACC AAC CTG OOG 2365 
Gly Leu Glu Pro Val Leu Leu Asp Leu Ala Arg Ala Thr Asn Leu Pro 
25 30 35 

CCGOGCGAGAOGCTCCIGCATGIG AOG GIC TOG AAC COC ACG GCG GOC 2413 
Pro Arg Glu Thr Leu Leu His Val Thr Val Trp Asn Pro Thr Ala Ala 
40 45 50 55 

GAC GCG CAG OGC AGC TACACCGGGCTGCOCC^GAAGOGCACCIGCIC 2461 
Asp Ala Gin Arg Ser Tyr Thr Gly Leu Pro Asp Glu Ala His Leu Leu 
60 65 70 

GAG AGC GTG CGC ATC TOG ATG GOGGCCCTCCaGGCGGCCATCGCGTaXS 2509 
Glu Ser Val Arg He Ser Met Ala Ala Leu Glu Ala Ala lie Ala Leu 
75 80 85 

ACC GICGAGCIGTICGS^GIGTCCCIGCGGTOGCCCGA^ CAA 2557 

Thr Val Glu Leu Phe Asp Val Ser Leu Arg Ser Pro Glu Phe Ala Gin 
90 95 100 

AGGTGCGACGAGCTGGAAGOCTATCTGCAGAAAATGGTCGAATaSArc 2605 
Arg Cys Asp Glu Leu Glu Ala Tyr Leu Gin Lys Met Val Glu Ser lie 
105 110 115 

GTCTACGCGTACCGCTTCATCTTOCCGCAGGTCTTCTACGATG?^^ 2653 
Val Tyr Ala Tyr Arg Phe lie Ser Pro Gin Val Phe Tyr Asp Glu Leu 
120 125 130 135 

CGCCCCTTCTACGAACCGATTCGAGTCGQGGGCCAGAGCTACCrCGQC 2701 
Arg Pro Phe Tyr Glu Pro lie Arg Val Gly Gly Glh Ser Tyr Leu Gly 
140 145 150 

CCCGGTGCCGTAGAGATGCCCCTCTTCGIGCTGGAGCACGTCCrc 2749 
Pro Gly Ala Val Glu Met Pro Leu Phe Val Leu Glu His Val Leu Trp 
155 160 165 

GGCTOGCAATOSGACGACCAAACTTATOGAGAATTCAAAGAGAOG^ 2797 
Gly Ser Gin Ser Asp Asp Gin Thr Tyr Arg Glu Phe Lys Glu Thr "Tyr 
170 175 180 



CTGOCCTATGTGCrrCCCGCGTACAGGGOGGTCTACGCTOQGTTCTOC 2845
Leu Pro Tyr Val Leu Pro Ala Tyr Arg Ala Val Tyr Ala Arg Phe Ser
185 190 195

GGG GAG CX:G GCG CIC ATC GAC CX3C GCG CTC GAC GAG GCG OGA QCG GTC 2893 
Gly Glu Pro Ala Leu He Asp Arg Ala Leu Asp Glu Ala Arg Ala Val 
200 205 210 215 

GGT ACX3 OGG GAC GAS CAC GTC OGG GCT GGG CTG ACA GCG CTC GAG OGG 2941 
Gly Thr Arg Asp Glu His Val Arg Ala Gly Leu Thr Ala Leu Glu Arg 
220 225 230 

GTC TTC AAG GTC CTG CIG OGC TTC OGG GCG OCT CAC CTC AAA TTG GOG 2989 
Val Phe Lys Val Leu Leu Arg Phe Arg Ala Pro His Leu Lys Leu Ala 
235 240 245 

GAGCGGGCGTACGAAGTCGGGCAAAGCGGCOOG AAA TOG GCA GCG GGG 3037 
Glu Arg Ala Tyr Glu Val Gly Gin Ser Gly Pro Lys Ser Ala Ala Gly 
250 255 260 

GGTACGCGCCCAGCATGCTCGGTGAGCTGCTCACGC TGAOGTATQC 3083 
Gly Thr Arg Pro Ala Cys Ser Val Ser Cys Ser Arg 
265 270 275 

CGCGCGGTCC OGCGTCCGCG COQCGCTOGA CGAATCCTGA TGOQOGOGAC OCAGIGTTKT 3143 

CTCACAAGGA GAGTTTGCCC CC ATG ACT CAG AAG AGC COC GCG AAC GAA CAC 3195 

Met Thr dn Lys Ser Pro Ala Asn Glu His 
1 5 10 

GAT AGC AAT CAC TTC GAC GTA ATC ATC CTC GGC TCG GGC ATG TOC QQC 3243 
Asp Ser Asn His Phe Asp Val lie He Leu Gly Ser Gly Met Ser Gly 
15 20 25 

ACC CAG ATG GGG GCC ATC TTG GCC AAA CAA CAG TTT GGC GTG CTG ATC 3291 
Thr Gin Met Gly Ala He Leu Ala Lys Gin Gin Phe Arg Val Leu He 
30 35 40 

ATCGAGGAGTOSTCGCACCCGaSGTTCAOGATCGQCGAAT^ 3339 
He Glu Glu Ser Ser His Pro Arg Phe Thr He Gly Glu Ser Ser He 
45 50 55 

CCC GAG ACG TCT CTT ATG AAC OGC ATC ATC GCT GAT OGC TAC GGC ATT 3387 
Pro Glu Thr Ser Leu Met Asn Arg He He Ala Asp Arg Tyr Gly He 
60 65 70 

CCGGAGGTCGACCACATCACGTCGTTTTATTOSAaSCAACGTTACG^ 3435 
Pro Glu Leu Asp His He Thr Ser Phe Tyr Ser Thr Gin Arg Tyr Val 
75 80 85 90 

GCG TCG AGC ACGGGCATTAAGCGCAACTTCGGCTTCGTCTTCCACAAG 3483 
Ala Ser Ser Thr Gly He Lys Arg Asn Phe Gly Phe Val Phe His Lys 
95 100 105 

CCC GGC CAG GAG CAC GAC COG AAG GAGTTCACCCAGTGCGTCATTCCC 3531 
Pro Gly Gin Glu His Ajsp Pro Lys Glu Phe Thr Gin Cys Val He Pro 
110 115 120



GAG CTG CCG TOG GOG CCG GAG AGC CAT TAT TAC CGG CAA GAC GTC GAC 3579 
Glu Leu Pro Trp Gly Pro Glu Ser His Tyr lyr Arg Gin Asp Val Asp 
125 130 135 

GOC TAC TEG TTG CAA GCCGOCATTAAATACGQCTQCAAGGTC CAC GAG 3627 
Ala Ty^: I^eu Leu Gin Ala Ala He Lys Tyr Gly Cys Lys Val His Gin 
140 145 150 

AAAACTACCGTGACCGAATACCACGOCGAT AAA GAC GGC GTC GOG GIG 3675 
Lys Thr Thr Val Thr Glu Tyr His Ala Asp Lys Asp Gly Val Ala Val 
155 160 165 170 

ACCACCGCCCAGGGCGAAOGGTTC ACC GGC OGG TAC ATG ATC (3^ TQC 3723 
Thr Thr Ala Gin Gly Glu Arg Phe Thr Gly Arg Tyr Met He Asp Cys 
175 180 185 



GGA GGA CCT CGC GCG CCG CTC GCG ACC AAG TTC AAG CtC CGC GAA GMi 3771
Gly Gly Pro Arg Ala Pro Leu Ala Thr Lys Phe Lys Leu Arg Glu Glu
190 195 200



CCG TGT CGC TTC AAG AOG CAC TCG CGC AGC CTC TAC AOG CAC AIG CTC 3819 
Pro C^s Arg Phe Lys Thr Bis Ser Arg Ser Leu Tyr Thr His Met Leu 
205 210 215 

GGG GTC AAG CCG TTC GAC GAC ATC TTC AAG GTC AAG GGG CAG CGC TGG 3867 
Gly Val Lys Pro Phe Asp Asp He Phe Lys Val Lys Gly Gin Arg Tip 
220 225 230 

CGC TGG CAC GAG GGG ACC TTG CAC CAC ATG TTC GAG GGC GGC TGG CTC 3915 
Arg Trp His Glu Gly Thr Leu His His Met Phe Glu Gly Gly Trp Leu 
235 240 245 250 

TGGGTGATTCOGTTCAACAACCACOOGCQGTOGAOCAAC AAC CTG GIG 3963 
Trp Val He Pro Phe Asn Asn His Pro Arg Ser Thr Asn Asn Leu Val 
255 260 265 

AGC GTC GGC CTG CAG CTC GAC COG OGT GTC TAC COG AAA ACC GAC ATC 4011 
Ser Val Gly Leu Gin Leu Asp Pro Arg Val Tyr Pro Lys Thr Asp He 
270 275 280 

TCC GCA CAG CAG GAA TTC GAT GAG TTC CTC GOG CGG TTC COG AGC ATC 4059 
Ser Ala Gin Gin Glu Phe Asp Glu Phe Leu Ala Arg Phe Pro Ser He 
285 290 295 

GGG GCT CAG TTC CGG GAC GCC GTG COG GTG CGC GAC TGG CTC AAG ACC 4107 
Gly Ala Gin Phe Arg Asp Ala Val Pro Val Arg Asp Trp Val Lys Thr 
300 305 310 

GAC CGC CTG CAA TTC TOG TOG AAC GOC TQC GTC GGC GRC CGC TAC TGC 4155 
Asp Arg Leu Gin Phe Ser Ser Asn Ala Cys Val Gly Asp Arg Tyr CSys 
315 320 325 330 



CTG ATG CTG CAC GOG AAC GGC TTC ATC GAC COG CTC TTC TCC COG GGG 4203
Leu Met Leu His Ala Asn Gly Phe Ile Asp Pro Leu Phe Ser Arg Gly
335 340 345

CTGGAAAACACCGCGGTGACCATCCACGIXCrcGCGQ^ 4251
Leu Glu Asn Thr Ala Val Thr Ile His Ala Leu Ala Ala Arg Leu Ile
350 355 360

AAGG(XCTGOGCGACGACGACTTCTCCOa:GAGOGCa^ 4299
Lys Ala Leu Arg Asp Asp Asp Phe Ser Pro Glu Arg Phe Glu Tyr Ile
365 370 375

GAG CX^C CTG CAG CAA AAG CTT TTG GAC CAC AAC <aC GAC TTC GTC AGC 4347
Glu Arg Leu Gln Gln Lys Leu Leu Asp His Asn Asp Asp Phe Val Ser
380 385 390

TGC TCC TAC ACG GCG TTC TCG GAC TTC CGC CIA TGG GAC GCG TTC CAC 4395
Cys Cys Tyr Thr Ala Phe Ser Asp Phe Arg Leu Trp Asp Ala Phe His
395 400 405 410

AQGCTGTGGGCGGTCGGCACCATCCTCGGGCAGTTCCGGCTC GTG CAG 4443
Arg Leu Trp Ala Val Gly Thr Ile Leu Gly Gln Phe Arg Leu Val Gln
415 420 425

GCCCACGOGAGGTTCCGCGCGTCGOGCAAC GAG GGC GAC GTC GAT CAC 4491
Ala His Ala Arg Phe Arg Ala Ser Arg Asn Glu Gly Asp Leu Asp His
430 435 440

CTC GAC AAC GAC CCT CCG TAT CTC GGA TAC CTG TGC GCG GAC ATG GAG 4539
Leu Asp Asn Asp Pro Pro Tyr Leu Gly Tyr Leu Cys Ala Asp Met Glu
445 450 455

GAGTACTACCAGTTGTTCAACGACGCCAAAGCCGAGGTCGAGGCCGTG 4587
Glu Tyr Tyr Gln Leu Phe Asn Asp Ala Lys Ala Glu Val Glu Ala Val
460 465 470

AGTGCCGGGCGCAAGCOGGCCGATGAGGCCGOSGOGCGGATTCACGOC 4635
Ser Ala Gly Arg Lys Pro Ala Asp Glu Ala Ala Ala Arg Ile His Ala
475 480 485 490

CTC ATT GACGAACGAGACTTCGCCAAGOOG ATG TTC GGC TTC GGG TAC 4683
Leu Ile Asp Glu Arg Asp Phe Ala Lys Pro Met Phe Gly Phe Gly Tyr
495 500 505

TGC ATC ACC GOG GAC AAG COGCAGCTCAACAACTOGAAGTACAGCCTG 4731
Cys Ile Thr Gly Asp Lys Pro Gln Leu Asn Asn Ser Lys Tyr Ser Leu
510 515 520

CTG CCG GCGATGOGGCTGATGTACTGGACGCAAACCCGCGOGOCGGCA 4779
Leu Pro Ala Met Arg Leu Met Tyr Trp Thr Gln Thr Arg Ala Pro Ala
525 530 535

GAG GTG AAA AAGTACTTCGACTACAACa^GATGTTCGCGCTGCTCAAG 4827
Glu Val Lys Lys Tyr Phe Asp Tyr Asn Pro Met Phe Ala Leu Leu Lys
540 545 550
GOG TAG AIC AOG ACC OGC ATC GGC CTG GOG GIG AAG AAG TAGOOGCIOG 4876
Ala Tyr He Thr Thr Arg He Gly Leu Ala Leu Lys Lys 
555 560 565 

AOGACGACAT AAAAAOG ATG MC ATT CAA TTG GAT CAA GOG AGO GIG 4926 
Met Asn Asp He Gin Leu Asp Gin Ala Ser Val 
15 10 

AAGAAGOGrOOGTCGGGOGOGTAGGAGGCAAOCAOGCGC GIG GOG GOG 4974 
Lys Lys Arg Pro Ser Gly Ala Tyr Asp Ala Thr Uir Arg Leu Ala Ala 
15 20 25 

AGO TGGTAGGIGGOGATGGGGTOCAAGGAGCIGAAGGAGAAGOOGAOG 5022 
Ser Trp Tyr Val Ala Met Arg Ser Asn Glu Leu Lys Asp Lys Pro Thr 
30 35 40 

GAG TTG AOG CTCTTCGGOGGTOOGTGCGTGGCGTGGOGCGGAGOC AOG 5070 
Glu Leu Thr Leu Phe Gly Arg Pro Cys Val Ala Trp Arg Gly Ala Thr 
45 50 55 

GGG OGG GOG GIG GIG ATG GAG OGG GAG TGG TOG GAG GTG GGG GOG AAG 5118 
Gly Arg Ala Val Val Met Asp Arg His Cys Ser His Leu Gly Ala Asn 
60 65 70 . 75 

CIGGCIGAGGGGCGGATGAAGQ^GGGGTGGATCCAGTGOCOGTTTC^ 5166 
Leu Ala Asp Gly Arg He Lys Asp Gly Cys He Gin Cys Pro Phe His 
80 85 90 

GAG TGG OGG TAG GAG GAA GAG GGG GAG TGG GTT GAG ATG GOG GGG GAT 5214 
His Trp Arg Tyr Asp Glu Gin Gly Gin Cys Val His He Pro Gly His 
95 100 105 

AAG GAG GGG GTG OGG GAG GIG GAGOOGGTGOOGOGGGGGGOGOGTCAG 5262 
Asn Gin Ala Val Arg Gin Leu Glu Pro Val Pro Arg Gly Ala Arg Gin 
110 115 120 

GOGAaSTTGGTGAGCGOGGAGO^TAGGGCTAGGTGTGGGTGTGGT^ 5310 
Pro Thr Leu Val Thr Ala Glu Arg Tyr Gly Tyr Val Trp Val Trp Tyr 
125 130 135 

GGG TOG GOG CTG COG CIG GAGOOGCIGOOGGAAATCTOGGOGGOGGRT 5358 
Gly Ser Pro Leu Pro Leu His Pro Leu Pro Glu He Ser Ala Ala Asp 
140 145 150 155 

GIGGAGAAGGGOGAGTTTATGCAGCTGCAGTIGGOGTTCGAGAaS 5406 
Val Asp Asn Gly Asp Phe Met His Leu His Phe Ala Phe Glu Thr Thr 
160 165 170 

ACG GGG GTG TTG GGG ATG GTG GAG AAG TTG TAG GAG GOG GAG GAG GGA 5454 
Thr Ala Val Leu Arg Ile Val Glu Asn Phe Tyr Asp Ala Gln His Ala 
175 180 185 

ACCCOGGTGCAGGGACTCOOGAICTaSGGGTTCGAACTCAAGCTGTO 5502 
Thr Pro Val His Ala Leu Pro Ile Ser Ala Phe Glu Leu Lys Leu Phe 
190 195 200 

GAC GAT TGG OGC CAG TGG GOG GAG GTT GAG TOG CTG GOO CIG GOG GGC 5550 
Asp Asp Trp Arg Gln Trp Pro Glu Val Glu Ser Leu Ala Leu Ala Gly 
205 210 215 

GCG TGG TTC GGT GOO GGG ATO GAG TTC AGO GTG GAG OGG TAG TTC GGG 5598 
Ala Trp Phe Gly Ala Gly Ile Asp Phe Thr Val Asp Arg Tyr Phe Gly 
220 225 230 235 

CCC CTC GGC ATG CTG TCA OGCGOGCTCGGCCTGAAGATGTCGGAG ATG 5646 
Pro Leu Gly Met Leu Ser Arg Ala Leu Gly Leu Asn Met Ser Gln Met 
240 245 250 

AAGGTGCACTTCGATGGCTAGCCCGGCGGGTGCGIC ATG AOC GTC GOO 5694 
Asn Leu His Phe Asp Gly Tyr Pro Gly Gly Cys Val Met Thr Val Ala 
255 260 265 

CTG GAG GGA GAC GTC AAA TACAAGCTGCTGCAGTGTGIGAOG GCG GTG 5742 
Leu Asp Gly Asp Val Lys Tyr Lys Leu Leu Gln Cys Val Thr Pro Val 
270 275 280 

AGC GAA GGC AAG AAC GTC ATG CAG ATG CTC ATG TCG ATG AAG AAG GTG 5790 
Ser Glu Gly Lys Asn Val Met His Met Leu Ile Ser Ile Lys Lys Val 
285 290 295 

GGC GGC ATG CTG GTC GGG GOG AOC GAG TTC GTG CTG TTC GGG CTG CAG 5838 
Gly Gly Ile Leu Leu Arg Ala Thr Asp Phe Val Leu Phe Gly Leu Gln 
300 305 310 315 

AGCAGGCAGGCCGCGGGGTAGGAGGTG AAA ATG TGG AAG GGA ATG AAG 5886 
Thr Arg Gln Ala Ala Gly Tyr Asp Val Lys Ile Trp Asn Gly Met Lys 
320 325 330 

a^GQ^GGCGGGGGGGOGTACAGOAAGTACGACAAGCrcG^ 5934 
Pro Asp Gly Gly Gly Ala Tyr Ser Lys Tyr Asp Lys Leu Val Leu Lys 
335 340 345 



TAGCGGGGGTTGTATGGAGGGTGGGTCGAGCGGGTGGCAAGTG^CGG 5982 
Tyr Arg Ala Phe Tyr Arg Gly Trp Val Asp Arg Val Ala Ser Glu Arg 
350 355 360 

TQOIGCGTGA AGOOGAGCOG CTCTCGAOOG OGTOGOTGOG OCAGGOGCTG GOGAAOCTGG 6042 

CGAGOGGGGT GAGGATCAGG GOOTAGGGOG GCTTGGGCTC GOGGOOAOCA 6102 

GCTTGGTGTG GGAGTCGCTG OTTGOGAGGT ATTCATGACT AICTGGCTGT TGCAACTOGT 6162 

GCTGGTGATC GCGCTCTGCA MGTCTGOGG GGGGATTGCC GAAGGGCTCG GGGAGTGOGC 6222 

GGTGATCGGC GAGATCGOGG GTTGGGGCCG TCGGTGTTCG GOGTGATOGC 6282 

AGOGAGTTTG TACGAGCTGT T0?nn3GOOG OGAGGTGCTG TCAGOGATGG OGCAAGTECAG 6342 

CGAAGTOGGG CTGGIACTGG TGATGTTOCA GGTCGGOCTG CAXATGGAGT TGGGOGAGAC 6402 

GCIX3CG0GAC AAOOGCIGGC GCA3X3CXX3ST CX30GAT0GQV G0GGGCX3GGC TOGIOGCACX: 6462 

GGCCGOGATC GGCKCGSKKSS TOGOCAIIQGT TTOGAAAGGC iyXCIOGOQV GOGAOGOGOC 6522 

GQCGCIGCXX: TATOTQCTCT TCTQOQGTGT CGOCTIXSCG G33VTOQGOQG TOCJOQGTGftT 6582 

GGCGOGOm: iviosycxsux roraGCICAG OGGSCAIIGGIG GGOGOGOGGC AOGCMSGTC 6642 

TGCCGCSmS CTGACQCSmS OGCTOQGaaXS GMCGCIQCTT GCMCXSKSTTG OCTOQCTflTC 6702 

GRGOQQGCCC GQCTGQQCM TIGCQOGCAir QCTOSTCaGC CIGCTCQCGT MCTGGTGCT 6762 

GIGOGCX^CIG CnX^GrCGOGCT TCGIGGTTOG AOOSAOOCIT G0G0GGCT06 OGIOGAOOGC 6822 

GCAXGOraCG CX30GAOa3CT TOGCX3GIGTT GTICIGCTTC GTAATGXTGT GGGCACTOGC 6882 

GAOGanXriG MOGGftTTOC MMSGQCTTT TOQCXXaCTT GOOQOGGOQC TCTTCGXGOG 6942 

OOGGGIGCXX: GGCGICGCGk AGGAGTGGC6 CG^iCM^CGTC QkAGGTTTOG TCAAGOn: 7000 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 560 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Asp Ala Arg Arg Leu Ala Ala Ser Pro Arg His Arg Arg Pro Ala 
1 5 10 15 

Phe A^ Thr Arg Ser Val Met Asn Lys Pro Ile Lys Asn Ile Val Ile 
20 25 30 

Val Gly Gly Gly Thr Ala Gly Trp Met Ala Ala Ser Tyr Leu Val Arg 
35 40 45 

Ala Leu Gln Gln Gln Ala Asn Ile Thr Leu Ile Glu Ser Ala Ala Ile 
50 55 60 

Pro Arg Ile Gly Val Gly Glu Ala Thr Ile Pro Ser Leu Gln Lys Val 
65 70 75 80 

Phe Phe Asp Phe Leu Gly Ile Pro Glu Arg Glu Trp Met Pro Gln Val 
85 90 95 

Asn Gly Ala Phe Lys Ala Ala Ile Lys Phe Val Asn Trp Arg Lys Ser 
100 105 110 

Pro Asp Pro Ser Arg Asp A^ His Phe Tyr His Leu Phe Gly Asn Val 
115 120 125 






Pro Asn Cys Asp Gly Val Pro Leu Thr His Tyr Trp Leu Arg Lys Arg 
130 135 140 

Glu Gln Gly Phe Gln Gln Pro Met Glu Tyr Ala Cys Tyr Pro Gln Pro 
145 150 155 160 

Gly Ala Leu Asp Gly Lys Leu Ala Pro Cys Leu Ser Asp Gly Thr Arg 
165 170 175 

Gln Met Ser His Ala Trp His Phe Asp Ala His Leu Val Ala Asp Phe 
180 185 190 

Leu Lys Arg Trp Ala Val Glu Arg Gly Val Asn Arg Val Val Asp Glu 
195 200 205 

Val Val Asp Val Arg Leu Asn Asn Arg Gly Tyr Ile Ser Asn Leu Leu 
210 215 220 

Thr Lys Glu Gly Arg Thr Leu Glu Ala Asp Leu Phe Ile Asp Cys Ser 
225 230 235 240 

Gly Met Arg Gly Leu Leu Ile Asn Gln Ala Leu Lys Glu Pro Phe Ile 
245 250 255 

Asp Met Ser Asp Tyr Leu Leu Cys Asp Ser Ala Val Ala Ser Ala Val 
260 265 270 

Pro Asn Asp Asp Ala Arg Asp Gly Val Glu Pro Tyr Thr Ser Ser Ile 
275 280 285 

Ala Met Asn Ser Gly Trp Thr Trp Lys Ile Pro Met Leu Gly Arg Phe 
290 295 300 

Gly Ser Gly Tyr Val Phe Ser Ser His Phe Thr Ser Arg Asp Gln Ala 
305 310 315 320 

Thr Ala Asp Phe Leu Lys Leu Trp Gly Leu Ser Asp Asn Gln Pro Leu 
325 330 335 

Asn Gln Ile Lys Phe Arg Val Gly Arg Asn Lys Arg Ala Trp Val Asn 
340 345 350 

Asn Cys Val Ser Ile Gly Leu Ser Ser Phe Leu Glu Pro Leu Glu 
355 360 365 

Ser Thr Gly Ile Tyr Phe Ile Tyr Ala Ala Leu Tyr Gln Leu Val Lys 
370 375 380 

His Phe Pro Asp Thr Ser Phe Asp Pro Arg Leu Ser Asp Ala Phe Asn 
385 390 395 400 



Ala Glu Ile Val His Met Phe Asp Asp Cys Arg Asp Phe Val Gln Ala 
405 410 415 

His Tyr Phe Thr Thr Ser Arg Asp Asp Thr Pro Phe Trp Leu Ala Asn 
420 425 430 

Arg His Asp Leu Arg Leu Ser Asp Ala Ile Lys Glu Lys Val Gln Arg 
435 440 445 

Tyr Lys Ala Gly Leu Pro Leu Thr Thr Thr Ser Phe Asp Asp Ser Thr 
450 455 460 

Tyr Tyr Glu Thr Phe Asp Tyr Glu Phe Lys Asn Phe Trp Leu Asn Gly 
465 470 475 480 

Asn Tyr Tyr Cys Ile Phe Ala Gly Leu Gly Met Leu Pro Asp Arg Ser 
485 490 495 

Leu Pro Leu Leu Gln His Arg Pro Glu Ser Ile Glu Lys Ala Glu Ala 
500 505 510 

Met Phe Ala Ser Ile Arg Arg Glu Ala Glu Arg Leu Arg Thr Ser Leu 
515 520 525 

Pro Thr Asn Tyr Asp Tyr Leu Arg Ser Leu Arg Asp Gly Asp Ala Gly 
530 535 540 

Leu Ser Arg Gly Gln Arg Gly Pro Lys Leu Ala Ala Gln Glu Ser Leu 
545 550 555 560 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 275 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Arg Asp Ile Gly Phe Phe Leu Gly Ser Leu Lys Arg His Gly His 
1 5 10 15 

Glu Pro Ala Glu Val Val Pro Gly Leu Glu Pro Val Leu Leu Asp Leu 
20 25 30 

Ala Arg Ala Thr Asn Leu Pro Pro Arg Glu Thr Leu Leu His Val Thr 
35 40 45 

Val Trp Asn Pro Thr Ala Ala Asp Ala Gln Arg Ser Tyr Thr Gly Leu 
50 55 60 

Pro Asp Glu Ala His Leu Leu Glu Ser Val Arg Ile Ser Met Ala Ala 
65 70 75 80 

Leu Glu Ala Ala Ile Ala Leu Thr Val Glu Leu Phe Asp Val Ser Leu 
85 90 95 

Arg Ser Pro Glu Phe Ala Gln Arg Asp Glu Leu Glu Ala Tyr Leu 
100 105 110 

Gln Lys Met Val Glu Ser Ile Val Tyr Ala Tyr Arg Phe Ile Ser Pro 
115 120 125 

Gln Val Phe Tyr Asp Glu Leu Arg Pro Phe Tyr Glu Pro Ile Arg Val 
130 135 140 

Gly Gly Gln Ser Tyr Leu Gly Pro Gly Ala Val Glu Met Pro Leu Phe 
145 150 155 160 

Val Leu Glu His Val Leu Trp Gly Ser Gln Ser Asp Asp Gln Thr Tyr 
165 170 175 

Arg Glu Phe Lys Glu Thr Tyr Leu Pro Tyr Val Leu Pro Ala Tyr Arg 
180 185 190 

Ala Val Tyr Ala Arg Phe Ser Gly Glu Pro Ala Leu Ile Asp Arg Ala 
195 200 205 

Leu Asp Glu Ala Arg Ala Val Gly Thr Arg Asp Glu His Val Arg Ala 
210 215 220 

Gly Leu Thr Ala Leu Glu Arg Val Phe Lys Val Leu Leu Arg Phe Arg 
225 230 235 240 

Ala Pro His Leu Lys Leu Ala Glu Arg Ala Tyr Glu Val Gly Gln Ser 
245 250 255 

Gly Pro Lys Ser Ala Ala Gly Gly Thr Arg Pro Ala Cys Ser Val Ser 
260 265 270 

Cys Ser Arg 
275 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 567 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Thr Gln Lys Ser Pro Ala Asn Glu His Asp Ser Asn His Phe Asp 
1 5 10 15 

Val Ile Ile Leu Gly Ser Gly Met Ser Gly Thr Gln Met Gly Ala Ile 
20 25 30 

Leu Ala Lys Gln Gln Phe Arg Val Leu Ile Ile Glu Glu Ser Ser His 
35 40 45 

Pro Arg Phe Thr Ile Gly Glu Ser Ser Ile Pro Glu Thr Ser Leu Met 
50 55 60 

Asn Arg Ile Ile Ala Asp Arg Tyr Gly Ile Pro Glu Leu Asp His Ile 
65 70 75 80 

Thr Ser Phe Tyr Ser Thr Gln Arg Tyr Val Ala Ser Ser Thr Gly Ile 
85 90 95 

Lys Arg Asn Phe Gly Phe Val Phe His Lys Pro Gly Gln Glu His Asp 
100 105 110 

Pro Lys Glu Phe Thr Gln Cys Val Ile Pro Glu Leu Pro Trp Gly Pro 
115 120 125 

Glu Ser His Tyr Tyr Arg Gln Asp Val Asp Ala Tyr Leu Leu Gln Ala 
130 135 140 

Ala Ile Lys Tyr Gly Cys Lys Val His Gln Lys Thr Thr Val Thr Glu 
145 150 155 160 

Tyr His Ala Asp Lys Asp Gly Val Ala Val Thr Thr Ala Gln Gly Glu 
165 170 175 

Arg Phe Thr Gly Arg Tyr Met Ile Asp Cys Gly Gly Pro Arg Ala Pro 
180 185 190 

Leu Ala Thr Lys Phe Lys Leu Arg Glu Glu Pro Cys Arg Phe Lys Thr 
195 200 205 

His Ser Arg Ser Leu Tyr Thr His Met Leu Gly Val Lys Pro Phe Asp 
210 215 220 

Asp Ile Phe Lys Val Lys Gly Gln Arg Trp Arg Trp His Glu Gly Thr 
225 230 235 240 

Leu His His Met Phe Glu Gly Gly Trp Leu Trp Val Ile Pro Phe Asn 
245 250 255 

Asn His Pro Arg Ser Thr Asn Asn Leu Val Ser Val Gly Leu Gln Leu 
260 265 270 

Asp Pro Arg Val Tyr Pro Lys Thr Asp Ile Ser Ala Gln Gln Glu Phe 
275 280 285 

2^ Glu Phe Leu Ala Arg Phe Pro Ser Ile Gly Ala Gln Phe Arg Asp 
290 295 300 



Ala Val Pro Val Arg Asp Trp Val Lys Thr Asp Arg Leu Gln Phe Ser 
305 310 315 320 

Ser Asn Ala Cys Val Gly Asp Arg Tyr Leu Met Leu His Ala Asn 
325 330 335 

Gly Phe Ile Asp Pro Leu Phe Ser Arg Gly Leu Glu Asn Thr Ala Val 
340 345 350 

Thr Ile His Ala Leu Ala Ala Arg Leu Ile Lys Ala Leu Arg Asp Asp 
355 360 365 

Asp Phe Ser Pro Glu Arg Phe Glu Tyr Ile Glu Arg Leu Gln Gln Lys 
370 375 380 

Leu Leu Asp His Asn Asp Asp Phe Val Ser Cys Cys Tyr Thr Ala Phe 
385 390 395 400 

Ser Asp Phe Arg Leu Trp Asp Ala Phe His Arg Leu Trp Ala Val Gly 
405 410 415 

Thr Ile Leu Gly Gln Phe Arg Leu Val Gln Ala His Ala Arg Phe Arg 
420 425 430 

Ala Ser Arg Asn Glu Gly Asp Leu Asp His Leu Asp Asn Asp Pro Pro 
435 440 445 

Tyr Leu Gly Tyr Leu Cys Ala Asp Met Glu Glu Tyr Tyr Gln Leu Phe 
450 455 460 

Asn Asp Ala Lys Ala Glu Val Glu Ala Val Ser Ala Gly Arg Lys Pro 
465 470 475 480 

Ala Asp Glu Ala Ala Ala Arg Ile His Ala Leu Ile A^ Glu Arg Asp 
485 490 495 

Phe Ala Lys Pro Met Phe Gly Phe Gly Tyr Cys Ile Thr Gly Asp Lys 
500 505 510 

Pro Gln Leu Asn Asn Ser Lys Tyr Ser Leu Leu Pro Ala Met Arg Leu 
515 520 525 

Met Tyr Trp Thr Gln Thr Arg Ala Pro Ala Glu Val Lys Lys Tyr Phe 
530 535 540 

Asp Tyr Asn Pro Met Phe Ala Leu Leu Lys Ala Tyr Ile Thr Thr Arg 
545 550 555 560 

Ile Gly Leu Ala Leu Lys Lys 
565 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 363 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Asn Asp Ile Gln Leu Asp Gln Ala Ser Val Lys Lys Arg Pro Ser 
1 5 10 15 

Gly Ala Tyr Asp Ala Thr Thr Arg Leu Ala Ala Ser Trp Tyr Val Ala 
20 25 30 

Met Arg Ser Asn Glu Leu Lys Asp Lys Pro Thr Glu Leu Thr Leu Phe 
35 40 45 

Gly Arg Pro Cys Val Ala Trp Arg Gly Ala Thr Gly Arg Ala Val Val 
50 55 60 

Met Asp Arg His Cys Ser His Leu Gly Ala Asn Leu Ala Asp Gly Arg 
65 70 75 80 

Ile Lys Asp Gly Cys Ile Gln Cys Pro Phe His His Trp Arg Tyr Asp 
85 90 95 

Glu Gln Gly Gln Cys Val His Ile Pro Gly His Asn Gln Ala Val Arg 
100 105 110 

Gln Leu Glu Pro Val Pro Arg Gly Ala Arg Gln Pro Thr Leu Val Thr 
115 120 125 

Ala Glu Arg Tyr Gly Tyr Val Trp Val Trp Tyr Gly Ser Pro Leu Pro 
130 135 140 

Leu His Pro Leu Pro Glu Ile Ser Ala Ala Asp Val Asp Asn Gly Asp 
145 150 155 160 

Phe Met His Leu His Phe Ala Phe Glu Thr Thr Thr Ala Val Leu Arg 
165 170 175 

Ile Val Glu Asn Phe Tyr Asp Ala Gln His Ala Thr Pro Val His Ala 
180 185 190 

Leu Pro Ile Ser Ala Phe Glu Leu Lys Leu Phe Asp Asp Trp Arg Gln 
195 200 205 

Trp Pro Glu Val Glu Ser Leu Ala Leu Ala Gly Ala Trp Phe Gly Ala 
210 215 220 

Gly Ile Asp Phe Thr Val Asp Arg Tyr Phe Gly Pro Leu Gly Met Leu 
225 230 235 240 

Ser Arg Ala Leu Gly Leu Asn Met Ser Gln Met Asn Leu His Phe Asp 
245 250 255 

Gly Tyr Pro Gly Gly Cys Val Met Thr Val Ala Leu Asp Gly Asp Val 
260 265 270 

Lys Tyr Lys Leu Leu Gln Cys Val Thr Pro Val Ser Glu Gly Lys Asn 
275 280 285 

Val Met His Met Leu Ile Ser Ile Lys Lys Val Gly Gly Ile Leu Leu 
290 295 300 

Arg Ala Thr Asp Phe Val Leu Phe Gly Leu Gln Thr Arg Gln Ala Ala 
305 310 315 320 

Gly Tyr Asp Val Lys Ile Trp Asn Gly Met Lys Pro Asp Gly Gly Gly 
325 330 335 

Ala Tyr Ser Lys Tyr Asp Lys Leu Val Leu Lys Tyr Arg Ala Phe Tyr 
340 345 350 

Arg Gly Trp Val Asp Arg Val Ala Ser Glu Arg 
355 360 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28958 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

aSATO^OGIC GG0Cn!aSACA COG!I^^ 60 

CTCTCAAGGC AOCftTTCICA TCGWaiCTC OGICOaOOC AIG®OGRGG CQOGftOGMG 120 

GTCGCTCTOC CTCCAIGGOC GGAC06AGGA CGCTCCTCAG GAOGOCOCTT GGAOGCGOCA 180 

CQOGAGOGGG TOQCTOQCIA AAQCTGOOCC CTOOCTCTOC TTCGMCTTC AOaATGQGC 240 

TOCTOCQGQG GQCACGOOQG TGmCAOOCA AQQCTCTTAC QCAQQCCIOG AAflGOQQQQG 300 

GCTOGOCTAT GQGCXirECAGr TCCAQQGACT TOQCTOOGTC TQGAAQOQOG QOGftDGAQCT 360 

CTTOGCOGRG GOCAAGCTOC OGmOGCAGG OGOCflAGGMT GOOGCIOGGT TOgOOCTCCA 420 

CCCCGOCCTG TTOGACAGOG COCTGCAOGC GCTTGTOCrT GAAGAOGAQC QGROQOOGGG 480 

CGTOQCTCIG OOCTTCIOGT GGAGftQGAGT CTOGCIQOGC TCOGTOGQOG OCADCAOOCT 540 

GOGCCTGCQC TT0CA30GTC CGAATQQCAA GTOCICOGTS TOQCTOCTOC TCGQCGADGC 600 

OGCAQGCGftG CXXXITOGCCT OGGTCCAAGC GCTOGCCAOG OSaVICAOGT CCCAGGAGCA 660 

GCTCCGCAOC CAGGSAGCTT OOCTCCAOG^ TGCICICTTC OGGGTTGICT GGAQUSCECT 720 

GOOCAGCXXTT AOGiaSCICT CIGAGGOOOC GAAGGGIGTC CSCCTtiGMA CAOGGGGICT 780 

OGAOCIOGOG CTGCAGGOGT CICIOGOCXS CXAOO^OGGT CIOGCIGCXX: roOGSAGOQC 840 

GCTOGAOCAA GQOQCTTOQC CT00QGQC3CT 0GTCX5T0GTC OOCTTCAJIOG raT0QCX3CIC 900 

TGGCGACCTC AIAGAGAGCG CTCACAACTC CaCOGOGOGC GCCCTOGOCT aX3CTQC3^AGC 960 

GTGGCTTGAC GAOSUVOGCC TOGOCTCCTC GCQOCTOSTC CIQCTCAOCX: GACRQQCXaT 1020 

CXXMOOCaC CCaSRCGMX AOGTOCIOa OCTCOCTCAC QCTCCICICT QQQQCnXCT 1080 

GOGC3VCX3QOG CAAAQOSUWC AXCQGAGCT CXXn?CTCTTC CTOSIOGftOC TQGAOCICQG 1140 

DCAGGOCICG GAGOGOGCXX: TOCTOGGOGC GCTCGACACIl GSUSVSOGIC iySCiaSCTECT 1200 

OOGOCarOGA AAaiGOCrCG aXXXS^OGir OGlXaATGCA OGCTOSRCAG AGQOQCTCM 1260 

CGCGCCGAAC GI3VTCC3ia3T GGAGCCTTCA OOTCXXXaJ^CC AAAGQCACCT TOGACTOQCT 1320 

OGCOCICGIC GAOGCIOCIC TAGOOOGIGC G000CIOGC21 CAAGGCX3AG TCOGOGICX^C 1380 

0GIGCA0G06 GCAOGICICA ACTTCXSGOGTV TGIOCICMC AOOCTTGGCA TGCTTOOGQV 1440 

CAAOGOGGGG CXXXriCQQCG GOGAAGGOGC GGGCATTGIC AOOQAGIOG GCTCAGGIGT 1500 

TTC00GA3AC ACIGZAOGOG AOCGGGIGAT GGGCA3CTTC OGOGCSyGGCT TIGGCXXTAC 1560 

QGTCX3T0GCC QmXXXX3l TCATCTGOCX: CSVTOCXDOGaT GCXrrGGI OC T TOGTCCAAGC 1620 

OGCCAGOGTC COOGTOGICT TECTCAOOGC CIACSKSGGk. CIGGIOGAOXs TOGGGGTOCT 1680 

CAAGOXAAT CAAOGTGICC !£CA3Kr»IGC GGCXS3GGC GGC3GI 0 GGm CX G (XX3CX3&T 1740 

OCAGCIOGCX; 0GCXACX7E0G G0GCX:GAAST CTTOnCACC GCX3O!0CAS GSAAGIGGG2^ 1800 

dscrcsaosc gogcicggct togaogaigc gcacciogog rocrcftos i G lOTSGGMm 1860 

OCSU^CAGCAT TTOCIGCX3CT GOOOSAGG GOGOGGCAaiG G»IGTOGTCX: TCAAOGOCTT 1920 

QGOGCGOGAG TTOGTOGACG CTrOGCTGCG TCTCCTQCXX5 AG0GGTCC3A GCTTOXSTCXSA 1980 

GA^rOSGCAAG AOGGATftTCX: GOQU^OOOGA OGCX^GIEAOGC CTOGOCTAOC COGGOGTOGT 2040 

TXAOOGOGOC TSCGKrCICI TGGAGGCIGG laXGKTCGk ATZCAAGAGA T0CI0GCAG21 2100 

GCIGCTOGAC CIGITCGAGC GOGGOGTOCT TCGIOOGCX^G CTCKICAOGr OCIGGGACAT 2160 

CXX3GCATGCX: CXXX^AQGCXST OXX^GOGOGCI OGCICAGGOG OGGCATATIG GAAAGTIOGT 2220 

CCTCACCGTT OOCGTCXXaT OGATCOCOSA AGQCAOCMC CTCX31CACGG GAGQCftCCGG 2280 

CAOGCTOGGC G0GCICA3X:G CXSCX^OCACCT OaTOGOCAAT 0GCX3G0GACA A5CA0CIGCT 2340 

CCTCACCIOG CGMAGGGTG OGAGOGCICC GGGGGCXXSAG GCATTGCGC^ GOGAGCIOGA 2400 

A5CTCTGGGG GCIGOGGTCA OGCTCGCOCX; GIGOGAOGOG GCX^GATCXAC GOGOGCTCC?^ 2460 

AGCXXrrCTTG GRCAGCATCC CQ^GCGCTCA CCCGCTCACG GCCGTCGTGC ACQCOQOOGG 2520 

CGCCCTTGAC GA3X3GGCTGA TCAGCXSACAT GAGOOOOGAG CGCATCXSACC GOGTCTTIGC 2580 

TCCCAAGCTC GACGCXX^CTT GGCACTTGC^V TCAGCICACX: CAGGACAAGG OOGCIOSGGG 2640 

CTTCGTCrrC TTCTOGTOCXS C3CIO0GG0GT CCTOGGOGGT ATQGGICAAT CCAACIAOQC 2700 

GGGGGGCAAT GOGrTOCnG AOGOGCTOGC GCA!ICACX:GA 0GCX3T0CA!rG GGCTCCOGG 2760 

CTCCTCX3CTC GCATGGGGCC OTTOGGCOSA GCGCAGCGGA AIGACCCGAC AACCTCAGCX3 2820 

GCGTCGAZAC OGCTOGCATC AGGOGOGCGG ax:TCOGATCX: AIOGOCTOGG AOGAGGGTCT 2880 

OGaXrrCITC GATATGGOGC TOGGGOGCXX: GGAGOOOGOG CIGGTCCXX^G OXX^CTIOGA 2940 

CA!rGAAOGOG CI0GG0GCX3V AGGOCXSAOGG GCXAOOCIOG AIGTIOCAGG GICI0GI0C3G 3000 

CGCICGOGIC GOGOGCAAGG T0G0CAGCA21 TAATGCXTIG GOGGOGIOGC TCAOOCAGOG 3060 

CXTTOGCXmCC CrcOOGOCTA C30GAO0G0GA GCX3CATGCIG CIX^GASCCTCG TOOGCX^CXXaA 3120 

AGCCGCCATC GTCCTCGGCC TaSCCTCGTT CGAATOGCIC GATCCCOGTC GCCCTCTTCA 3180 

AGAGCTCGGT CTOSATTOCX: TCA3!GGCX:AT CGAGCIOOG& AATOGACICG CCGCCGCOC 3240 

AGGCTTGOG/l CICCAAGOCA (TCTOCmCIT OGACCAOOOG A0G00CX30CG OGCIOGQGAC 3300 

CXnXSCTGCT C GGGAAGCICC TCCAGCAIQV AGCIGCX^GAT OCIOGCXrCT TOGOOGCASA 3360 

GCTCGACAGG CXAGAGGCX3V CICTCICOGC GfOhQCCGSG GAOGCICAAG CAOGOCX^GAA 3420 

GATCATATTA CGCCTGCAAT CCTGGTTGTC GAACTGGaGC GACGCTCAGG CTGCOGACGC 3480 

^TGGACCGATT CTGGGCAAGG ATETCAAGIC TOCXAOGAAG GAAGAGCICT TCX3CTGCTTG 3540 

TGAOGAAGCG TTCGGAGGOC TOGGXAAATC AAXAAOGAOG AGAAGCTTGT CrOCIACX:iA 3600 

CAGCAGGOGA TGAA!IQU3CT TCAGOGIGCT CA!ICAGOOOC TCCX30G0GGT OGAAGAGAAG 3660 

GAGCAOGAGC OCATOGOCAT OGIGGOGA!IG AGCIGOOGCT TOOOGGGOGA OGIGOGCAOG 3720 

CCCGAGGATC TCTGGAAGCT CTTGCOX^GAT GGGAAAGAUG CTAHdOOGA CCTTCCCCCh 3780 

AAOOGTGGTT GGAAGCTOaA CX30GCTOGAC GTOCACGGTC GCTCCCCAGT OOGAGAGGGA 3840 

GGCTTCTTCT AOGACGCAGA OGCXnmCGAT <XX3GC3CTTCr TOGGGATCAG 00CA0GCGW3 3900 

GCGCTCGCCA TCX3VTCCCCA GCAGOGGCTC CTCCTCGAGA TCTCMX3GGA AGCXTTOaG 3960 

0GTGCX3GQCA TOQ^CXXTTGC CTOQCIOCAA GGGAGOCAAA GOGGOGTCTT OGTOGQCCTG 4020 

AXACACAAOG ACXAOOm: ATTGCIGGAG AA0GC3VSCTG GOGAACftCAA AOGATTOGTT 4080 

TOCAOOGGCA GCACAGCXSAG OGIOGOCTOC (jGCCQSKTOS OGXAXACAZTT OGGCTTTCAA 4140 

GGGC00G0C3^ !raV9C3GIGG^ CAOGGOGIGC AGCIOCIOGC TOGIOGOGGT TCAOCTOGOC 4200 

TCCCAG0OOC TOOQOCGTGG OGAKTOCTOC CTGQCX3CT0G CCGQOQGOGT GAOCCTCaaiG 4260 

GCCACX30CAG CftGTCTTCGT CQOGTrOGAT TCOGAGAGCG OGQGCGCXXX: OGATGGTCGC 4320 

TGCAAGTOGT TCTCGGTGG^V GGGCAAOGGX !ICGGGCTGG6 OCXaGGGOGC CXSGGATGCTC 4380 

CXGCTOGAGC GOCICICCGTl TOOCGTOCAT^ AAOGGICATC OOSIOCICGC OSIOCITOGZV 4440 

GGCXC0GCX3G TCAftOCAGGA OGGOOGGAGC CAAGC^XTEOV CCGCGOOCMi HGGCCCTGCC 4500 

CftACSyGCGOG !EC&ICOGGCA AGOGCICQAC AGCX^OGOGGC TCACTOCAAA GGAOSIOffiC 4560 

GTCX^rOGAGG CXCAOGGCAC GGGAAOCAOC CTOGQ^SACC CXSOCSAOGC ACAGGOCATT 4620 

CTTGCCACCT MGQOGAGQC CX3OT0CCAA GftCAmCCCC TCTGQCTTCG AAGTCTCAAG 4680 

TCXIAACCIGG GACASCGCICA GGCX3G0GG0C GGOSIGGGAA G0C3ICATCAA G&XGGIGCIC 4740 

GOGTTGCAGC AAGGCXrTCTT G0CX3^AGA0C C3XX»!rGCXX: AGAA!rOOCIC CXmSOVTC 4800 

GACIGGTCIC OSGGCAOGGT AAAGCTCXnG AAOG»GCXXX> OIOGICIGSC mOCAAOGGG 4860 

CA3XXIT0G0C AOGOOGGOST CI X 13GCX.Ti C GGCA!ICIO0G GCAOCAAOGC CCMXSSCKrC 4920 

CTCGAAG?^ COCXTGOCAT OGCCX^GGGTC GAGCXXXKS^ OSTCACAGOC OGOGTOOGAG 4980 

CCGCnCCOG CAGOGTGGCX: O^IGCICCIG TCGGOCAAGA. GCX^AGGOGGC 0610000300 5040 

CAOOCAAAGO GGCIOOGO^ CCACXnXXnC G0CAAAAG06 AGCIOGOOOT OGOOGAIGIG 5100 

GOOXATTOOC lOSOGAOCAC GOOOOOOCAC TTOGAGCAGO GOOOOOCICT OCIOOICAAA 5160 

GGCOGOraOO AGOIOCTOIO COOOOIOGAT GOGOIGGOOO AAGGACATTO 0G00G008IG 5220 

OTCGGADGAA QOQGGQCCCC AGGAAAGCTC QOOGTOCTCT TCAOQOQOOA AGGAAQOCAG 5280 

CGQCCCAOCA TQGQOOGOGG OCTCTAOGAC GTTTTCCCCG TCTTCOGGGA OOOCCTOGAC 5340 

AOOGTOGQOG OOCADCTOGA COOOCaOCTC GAOCGOOOCC TOOGOGAOGT OCTCTTaOOT 5400 

OOCQ^OOOCr COraGCAGGO OGOOOOOCIO GAGCAAAOOO OCTICAOOCA OOOGGOOCIG 5460 

TTTGOOCTOG AAGTOQOOCT CITTCAGCTT CTACAMOOT TOQGICTGAA GOOOQCICIC 5520 




cionxasGAC ioxrwrrGG cgagctogtc Gcxmxacxs togcx:qgogt cxottctctc 5580 

CAGGACX^GCT GCAOCCTOGT OGOOGOCTGC GCAAAGCIOV TGCAAOOGCT CTCACAAGGC 5640 

GGOGOCATGG 7CAC0CI00G AGOCTOOGAG GAOGAAGTCC GCGAOCTTCT COGOCCm: 5700 

mAGGCOGAG CTAGOCTOGC CX300CTCAAT GGQOCTCTCT CXaOOGTOGT OGCTQGOGKT 5760 

GAAGACGCGG TGCTGGAGAT CGCCCGCCAG GCCGAAGCOC TCGGACGAAA GACCACAOGC 5820 

CTGOGCGICA GCXIACGCCTT CCATTCOOCG CACATGGAOG GAAIGCTOG?^ 0GACTT0CX3C 5880 

OGOGTOSCXX: ASAGCXrrCAC CXAOCATOCX: GCAOGCA!rOC CCMXXrCrO CAAOGTCAOC 5940 

GGOGOGOGCG 0CA0GGA0C2V OGAGCIOGCX: TOSCXX^GACT ACIGGGIOOG OO^OGTTOGC 6000 

CACACXX3TCC GCTTOCTOaA OGGCXSTAOGT GCXCTTCACG OOGAAQGQQC i«31GTCTTT 6060 

CTCGAGCTCG GGOTrCACGC TGTCCTCTCC GCCCTTGCGC AAGAOGCCCT 0G(3VCAGC»C 6120 

GAAGGCACGT CXSOCATGOGC CTTCCITCCC J^CCTCCGCk AGGGAOGOGA CGAOGOOGAG 6180 

GO^riCAOOG OCGOGCIOGG OGCICTOCAC TOOGCAGGCA TCACAOCXS^ CIGGA0CX3CT 6240 

TTCTECGCCC OCTTOGCICC ACX^CAAGGTC TCCCTCCCCh CCnSKSGCCTT OCAGOGOGAG 6300 

CX3CTTCTGGC CX::GA0G0CTC CAAGGCACCX: GG0GCXX3A0G TCAGOCAOCT TOCTOCXSCXC 6360 

raGGGGGGGC OICTGGCAAGC CKTCG^^QCGC GGGGACXTICG AIGCGCTCAG OGGICAGCTC 6420 

CAOGTGGADG GOGAOGAQCG GCGOGCOGCG CTOGCCCTGC TCCTTCCCAC CCTCTOGAGC 6480 

TTTCGCCACG AGOGGCAAGA GCAGAGCACG GTOOmn GGOGCTAOOG TA3X3UXnGG 6540 

AAGCCICTGA 0CA00GCX:GA AACACXX^GCX: »OCia30CX3 GCACCIGGCT GGIOGIOGIG 6600 

CXmXGCIC TGGAOGAOGA OGOGCICOCX: !EO0GQGC!ICA OOGAGGOGCT CAOOOGGOGC 6660 

GGGGOGOGCX; TCCTOGCCTT GCGCCTGMSC CAGGOCXSICC 'TGGAOCX^C^ GGCICIOGOC 6720 

GAGCATCTGC GCCAGGCTTG OGCOGAGACC GCXXXX3A3TC G0GGCX3TQCT CTm:TOCTC 6780 

GCCCTCGAOG AGCGCXXXXn: 0GCAGA0CX3T CCIGCCCTGC 0CX300GGACX CGOCCICICG 6840 

CTTTCTCTCG CTCAAGOOCT C3GG0GA0CTC GAOCTOGAGG OGOCXnGTG GTTCTTCAOG 6900 

OGGGGOGGCG TCTCCATTGG ACACICIGAC CXXTIOSCXX: AIOOOGOOCA GGOCASXSAOC 6960 

TGGGGCnGG GCOSOGICAT OGGOCTOGAG O^CCOSfO: GGIGGGGAGG TCIOGIOGAC 7020 

GTCTGCGCTG GGGTCGACGA GAGCGCXCTG GGCCGCTTGC TCOCGGCXCT OGCCGAGOGC 7080 

CADGACGAAG AOCAGCTOSC TCTCX3G0C0G GCCQGACTCT ACGCT0GCCX5 CATOGTOOGC 7140 

GOOCTGCICG GCXSAIGCX^CC TOOOGOGOGC (SCTTCAOGC OCX3GAGGC3VC CATTCICAIC 7200 

ACCGGCQGCA COGGOQCCftT TQQOGCTCAC GT0QCXX3GM GQCIOGCTOG AAGRGQOQCT 7260 

CaGCACCTCXS TCCTX^TCAG OOG(30GAi3GC GOCGAGQCCC CTGGOQOCTC QGAGCTOCAC 7320 

GA06AGCICT CQQCCCIOSG OGOGOGCAOC ACOdOGOOG OGIGCGA3X3T CX3CCGACX:G6 7380 

AATGCTGTOG CXSOSCTTCT TGAGCftGCIC GAOGOOGAAG GGIOGCAGGT OOGOGOOGIG 7440 

TTOCAOSOQV GOGGCA!EOGA ACSUXSmTT OOGCIOGAOG CCRCCICITT CAGGGATCIC 7500 

GOOGKOGTTG ^TCTimSGCA?^ GGIOGAAGGT GCAAAGCACC TOC3^0G2VOCT GCTOGGCTCT 7560 

CGROOCCTCG imXTTTGT TCTCTTTTCG TCCQGCGCX3G COGTCTGGGG CX3G0GGACAG 7620 

CAAGGOGGCT A0GCGGCX3GC AAAOGCCITC CICGACGCOC TIGGOGAGC^ OX^GGCXSCAGC 7680 

GCJXaWTJXSA .a^^ GSIGGOCrGS GGOGOGIGGG GOGGOGGCGG CAXGGOCAOC 7740 

GATCAGGOGG CMjCOCtCCr OCAACAGOGC GGICTGTOGC GGASGGCXXT CICGCTTGCX: 7800 

CTGGOGGOGC TOGOGCIGGC TCIGGAGCAC GAOGAGAOCA OOGTCAOOGT OGCCGM^OC 7860 

GACIGGGOGC GCTTTGOGOC TTOGTTCAGC GOOGCICGOC 000GCXX3GCT CXn!GOGOGAT 7920 

TTGCOCGAGG OGCAGCGOGC TCTCXSAGACX: AGOSAAGGOG OGTOCTOOGA GCATGGOOOG 7980 

GC0CXX::GA0C TOCIOGACAJ^ GCIOOGGAGC OGCIOGGAGA GOGAG CA G CT TOGI C IGCT C 8040 

GICTOGCTGG TGOGOCAOGA GAOGGOOdC GT0CTCX3G0C AOGAAGGOGC CT00C3^IGIC 8100 

GAOGOOGACA AGGGCnOCT OSAIICTOGGT CinSATTOGC !tCAaX3G00GT OGAOCITOGC 8160 

OGGOGCTEGC AACAGGOCAC OGGCA!ECAAG CICXXX3GCX3^ CXXnCGCXTTT OSMXKTCCC 8220 

TCTCX:iX3VIC GAGnOGOGCT CTICTTGOGC GSOH^CTOG CXXZAOGOOCT OGGCAOGAGG 8280 

CTCTCOGTCG AGCCOGAOGC OGOOGCGCTC 0CX3G0GCTTC GCGCCGCGM^ OGACraGCXX: 8340 

Aia3CX3Via3 TOGGCAIOGC CCICCGCCIG OOGGGCGGOG TOGGOGAIGT OSAOGCniCTT 8400 

TGGGAGTTOC TGGQOCAGGG AOGOGAOGGC GITOGAGOOCA ^niOCAAAGGC CXXaAITGGGAT 8460 

GOOGCIGOGC lX:n3^a3NXC OraOCTOGAC GOCAAGAOCA AaOCIAOGT OOGGCAXGOC 8520 

GCX»!rGCTOG AOCAGGIOGA OCTCITCIGAC OCIGOCITCT TIGGCATCA6 OOCXXX^GGAG 8580 

GCCAAACACC TOGAOCOOCA GCACOGOCTG CTOCTOaAT CTGCCTGGCA GGOCXyrCGMi 8640 

GA0GCXX3GCA TOGTOCOCOC CAOCXnXIAAG QmmX3CA CX:GGOGroTT OGTOGGCA!rC 8700 

GGOGOCAGOG AKThCGCKn GOGACSGGOG AGCAOOGAAG KnCCG^OSO TTASGCXXmC 8760 

CAAGGCACX:G CSOGGGTCTTT TGOOGOGGGG OGCTTGGCXn? ACAOGCrrOGG CXn:GCAAGGG 8820 

CCXX3CGCTCT OGGTCGACAC OQOCTGCTCC TCCTCX3CTCG TCGCXXTTCCA CXTTOGCCTGC 8880 

CAAGOCXnXX: GADVGGGOGTV GIGCAAOCTC GOOCICGOCG C3GGG0SICTC OGTCATOGOC 8940 

TOXCCGAQG QCTTOGTOCT OCTTrCCCGC CTGC3G0Q0CT TOQCX3CXXX3A OQQOOQCTOC 9000 

AAGACCTTCT OGGCX:?^AOGC OQ^aCTAC GGiy::GOGGAG AAGGOGTCAT OGTOCTTGCX: 9060 

CTCGAQCX3QC TOGCSTGaOQC CCT0G0CX33V GGACACOGCG aXTTOGCCCT CGTOOGOGGC 9120 

AtXGCCATCA ACCACGAOGG CGOGTOGAGC GGTKPCAOOG 00CXX3\A0GG CAOCTCCCftG 9180 

CAGAAGGTCC TOOQCGCOGC QCTOCAOGAC QC300QCAICA (X3C300QOOGA OGIOGAOGTC 9240 

GTOaAGTQCX: ATCQCACOGG CACCTOCTTG GGAGACCCX3^ !EaaVGG?IQCA AGCXXTOGOC 9300 

GCXXSrCEftCG 0CX3A0QGCAG A0CXX3CXGAA AaOOCTCTCC TTCTCGGOQC GCTCAAGROC 9360 

AftCMOQGCC ATCTOGAQGC CGOCIOOQQC CTCGOQQGCG TOQOCAAGAT OGTOGOCTOC 9420 

CTCCGOCATG AOGCCCTGCC CXXXACCCTC CACACGGGCC CGCGCAATCC CTTGATTGAT 9480 

TGQGMACAC TOGCCATCX3A OSTOGTTQ^ A0CXXX3AQCT CTTGGQCXXrG CXaCGAAGftT 9540 

AGCASrOOOC GCTGOGOOQG OGXCTOOQOC 1!T(»SRCTCT OCGGCMCAA OQCXXaOGTC 9600 

AICCTOGRGG AGGCX0CX3GC CJQOOCTGTaS GG0GAGCXXS3 OCACX^TCaCA GAOQGCXsJTOG 9660 

OGACCGCTOC COGCGGCXSTG TCCCGIQCTC CIGTOGQCrA GaOCGAQGC OQC30GTO0QC 9720 

GCXX3VGG0GA AGOQQCTOCX; CXaOCAOCTC CTOGOCCAOG ACGACCTCGC CCXTATOGAT 9780 

GTGGCCTATT 0GCAGGCX3VC C3U300GCGCC CACTTOGAGC AC0G0GCXX3C TCTOCTGGCX: 9840 

OGOGACOGOG AOGAGCIOCT CKTGOGCIC GACICGCICG CXX3USGACAA QOOCXSOCCCG 9900 

AGCAOCGTEC TCGGCCQSMS OGGAAGOCAC GGCAAGGIOG TCirO^ICTT TXTIGGGCAA 9960 

GQCTOGCftGr GGGAAGGGAT QQCOCTCTOC C3X3CIOGACr OCTOQCrGGr C Timj O J L n' 10020 

CAGCTOGAAG CAIG0GAG06 OGOGCIOGCT OCICAOGTOG AGTGGAGOCT GCTCG008IC 10080 

CTGCX^COSCG AOSAGGGCGC CCTCTCCCTC GACCGOGTCG AOGTCGTAGA GCCXI^CCCTC 10140 

TTTGOCGTCA TCGTCTCOCT GQCXXXXX3C TCGOGCICQC TOSQCCTOGA GCOCXSOOQCX: 10200 

GTOGICGGCX: ACAGCCAGGG OGAGATOGOC GCX^GOCTIOG TOGCAOGOGC TCICIOOCIC 10260 

GAGGAOGCX^G CGOGCAIOGC CGCCCIXSCGC AGGAAAGOGC rcACXACOGT OGGOGGCAAC 10320 

GGOGGCATGG OOGOOGTOGA GCTOGGCX^OC TCOGAOCICC AGAOCTACCT OGCTOOCIGG 10380 

GGCGACAGGC TCTCX:ACCGC OGCOGTCAAC AGCCCCAGGG CTACCCTCGT ATCOGQOGAG 10440 

COOGCX^GOOG !tOGAOGOGCT GCTOGAOGTC CICACOGOCA CCAAGGTGTT 0G0C30GCAAG 10500 

ATOOGOGXOG ACTAOGCCTC OCACTCXGOC CAGATGGAOG OSAGCTOGCX: 10560 

GCAGGTCTAG CCAACATCGC TCCTCGGACG TGCQUSCTCC CTCITEfiTTC GAOOGTCACC 10620 

GGCACCAGGC TOSAOGGCTC OiT^GCICS?^ GGCGCGX^O? GGIATOGAAA CCTCCGGCMi 10680 

iUDOGICXriGT TCTOGAGOGC S^OOraGCX^G ClXnOGAOG AIGGGCATOG CITCICXSIC 10740 

G?^0GICftGOC 00CAT0CXX3T GCIGAOGCZC QCCCJXXXSCG iySAOCIGOSl GOGCICAOCG 10800 

CIOGAIOOOG TOGICGIOGG CIXXXTSCGA OGAQUUSAAG GOCAOCIOGC CXX30CIGCIC 10860 

CTCroCTGGG OSGaWSCICTC TAOOCGAGGC CTCX3CGCTCG ACTGGAAGGA CTTCTTOGCG 10920 

CXXTEACGCTC CXX^GCAAQGT CTCCXTTCCCC ACCTftOCSCrT TOCftQOGSRGA GOQGTTCTGG 10980 

CICGACGICT CCAOGGAOGTl AOGCTTCCXSl OGTCGOCICC GCAGGOCIQl CC r C GQC C Gh 11040 

ocaarcooQc toctoqqoqc oqcogtoscx: ttoqcxxsacc qoqqtqgctt tctcttima 11100 

GGQCQQCTCT OOCICQCaGA GCACOOGTQG CTOGAAGQOC KCQCXXSTCrT OGQCaCAOOC 11160 

ATOCTROCQG QCACX^QGCTT TCTOGaOCTC GOCCTQCAOG TOQ00CA0CX3 OSIOGGOCTC 11220 

GACACOGTCG AAGAGCTCAC GCTOGAQGOC CX^CTOSCTC TCOCMOGCA GGACA0CX3TC 11280 

CTOCTCCAQ^ TCI00GIGG6 GCOOGIGGAC GAOGCAGGAC GAAGGGOGCT CTCmOCAT 11340 

AGOCXl^^CAAG AGGA0GCX3CT ^ICAGQVTGGC CXX^IGG^O^C GCCAOGCTAG O G GCTCICTC 11400 

nOGCOGGOG?^ OCXXATOOCT CIOOGOOraT CTOCAOGAGT GGCXnXXXHX: GAGIGCX»!IC 11460 

cx:ggtggaoc i^gm^qgoc:! cn^c^ ciogocaaoc tcgggcitcc ciaoggoooc 11520 

GAGTTOCAGG GCCSXSOGOIX: OGTCrEAOAG OG09GOG7UDG iySCHICITTGC OGAAGOCAAG 11580 

CTCCCGGAAG OGGOOGAAAA GGATGCOQCC CQGTTTQCXX: TOC3mnX3C QCTGCTOSaC 11640 

ASCGOCCTGC AIGCACIGGC CITIGAGGAC GAGCAGAGAG GGAOGGIOGC TCIGOXnTC 11700 

OX^GIGGAGCXS GAGICICGCT GCX;CI0C9GIC GGTGCXIJyOCA OCn!IGOGOGr GCX^CrTTOCAC 11760 

cgtccx:aagg gtgaaigctc asrcroGArc gtcciggctg aogoogcagg tgacxticit 11820 

GCXniOGGIGC AAGOGCIOGC CA3X30GGACG AOGICOGOCG CGCMSCTCCG CAOOOOGGCA 11880 

GCTTCCCACC ATGATGCGCT C3TC0GCGTC GACTQGAGOG AGCT(XAAAG CCXX3CTTCA 11940 

CCQCCIIQOCG COCOraGCGG CX?ICCrTCTC GGCACAGGOG GCCAOSVTCT OGOGCIOGAC 12000 

GCXXX:GCICXS COOGCTAOGC OGAOCTCGCT GCrcrCOGA^ GCGOOCID^ OCAGGGOGCT 12060 

TOGCXrrOOCX; GCXniOGIOGr OGOCXXXTTC ATOGAUCGAC OGGCAGGOS^ CdCGTCCCG 12120 

AGOGCCCACG AGGCCACOGC GCTCGCACTC GCOCTCTIGC AAGCXnXSGCT OGCOGACGAA 12180 

CGCCTCGOCT CXSTOSCGCCT OGTOCTOGTC A0CCGACGCX3 COGTOQCrAC CCACAOOGAA 12240 

GAOQAOGTCft AGGACXnm: tECAOGOGCOG CTCTGGQGGC TOGOGOGCTC OGOGCAAAGT 12300 

GAGCACOCAG AOCIOXXaCT CTTOCICGTC GACATOGAOC TCAGOGAGGC CTOOCA0CAG 12360 

QCOCTQCTAG GOGCGCTOGA CACAGGftGAA OQCCAGCTOG OCCTOOQCAA OGGGAAAOOC 12420 

CTCarCCCGA GGTTGGOGCA ACCACGCTCG ACGGACGCGC TCATOCXX^ GCAAGCAOOC 12480 

ACGTGGCGCC TCCM!ATTCC GACCAAftGGC ACCTTOaAOG OQCTOQCXXn! CGTOaAOQOC 12540 

CCCGAGGCCX: AGQCXSGOCT OQCaCAOSGC CAAGTOOQCA TOQCXX3IQCA OQOGGCAQGG 12600 

CTCAftCTTOC GOGMGTOGT OGACaOCXOT GGCAIGIftTC CX^QQOGRCGC QCCGCCGCTC 12660 

GGaGQOSBAG GOQOGGGCftT OGTXACTGAA GTCGGTOCaG GIGSCtCOCG AEACftCCGTA 12720 

GQCX3ACCGGG TGftTGGGGGT CTT0GG0GC31 GCX^TTTGGTC OCaOSQCXaT OQOCGAOQCC 12780 

CX3CATGATCT GCCCCATOCC OCACGOCTGG TCCTTCGCrC AAGOCX^CCAG .OGTCCCCATC 12840 

ATCIATCTCA CXX3CX}330A TGGACTOGTC GMCTOGQQC ATCIGAAACX: OMCAAOGT 12900 

GTOCTOVIOC tiaSXXSGCCGC OQGCX^QOGTC GQGftOGGOOG CXCTTCAQCT OGCAOSCXaC 12960 

CTCGGOQCOG AGGTCTTTGC CftOOGOCRGT CGAQGGAftGT G(3RG0QCICP CX^OGOQCTC 13020 

GGCTTOaftOG MQOGCAOCT OQOGTCXnCA OGIGAOCTQG GCTTTOGaGCA GOOTOCTG 13080 

OGCTOCAOQC ATGGGOGOGG CftTGGKPGTC GTCCTOSACT GTCTGGCACG OGRGTTOGTC 13140 

GACGCCTCGC TGCGCXH^CAT GCCGAGOGGT QGftCGCTTCA TOGAGR3K3QG AAAGACGGAC 13200 

AXCOGTGftGC CXXaftOGOGAT CQQCXTOGOC TftOOCIQQCG TOSrmOCXS OSCCrrCGRC 13260 

GTCftCAG^ C^OGGftOOGGA TOGAATTGGG CMSKSXaCTCG CAGAGCIGCI CAGOCICITC 13320 

GAGOQOGGTG TCCTTOGTCT GCXftOCXMC ACMOCTCGG ACSOOOGTCA TOCXXTOCftG 13380 

QCCTTOCQCG 0GCT0GCXX3^ GGOQOGQCftT GTTQQGAAGT OnSTOCTCAC C3mKXXX3GT 13440 

CX:GATCGATC CCGAQGGGAC OGTCCTCATC ACGQGAGGCA CrOGGAOSCr ASGAGTOCTG 13500 

GTCGCACGCC ACCrOGTa3C GAAACACAGC GOCAAACACC TGCIOCICAC CIOGAOGAAG 13560 

GGCGOGCGIG CICCGGGOGC GGftGGCTCIG CQ^AGOGAGC TOGAAGOGCX GGGGGCXTTOG 13620 

GTCACOCICG TCG0GIGCX3^ CXSIGGCXXSAC OCAOGOGCXX: TOOGGAOCXn! OCTGGACAGC 13680 

ATCOOGAGGG AICATCOGAT CA0GGCXX3TC GIGCAOGCXXS CX^GGOGCXXTT OGAOGAOGGG 13740 

CCGCTCQGTA GCMGftGOGC CX3AQCGCATC GCTCGOGTCT TTGAOCXXaA GCTOGMQOC 13800 

GCTTGGTACT TGCaTCaGCT CACOCftGGaC GAGCXX3GT0G CQGCXnTOGT CCl V TIt'l U j 13860 

GCCGCCTCCG GCGTCCTTGG TGGTOCAGGT CAGTOGAACT AOQXGCTGC CAATGOCTTC 13920 

CrOGATGCGC TCGCACAITCA OOGGOGOGCC CAAGGACICC CAGCXX3CITC GCTOGCTTGG 13980 

GGCIACIGGG CX^GAGOGCAG TGGGAIGAOC OGGCAOCTCA GCGCCGCCXSk OGOOGCIOSC 14040 

ATGAGGOGOG CCGGOGTCXX; GOOCXn!OGAC ACIGAOGAGG 0GCICKX3CT CTICGfaGrG 14100 

GCTCTCTTQC GRCGCGftGCC aSCTCTQGrrC CCOSCCCCCr TCSaCTRCAA OSKSCTCftGC 14160 

ACGAGTGCCG ACGGCGTGCC CrCGCTGTTC CAGOGTCTCG TOCGCGCIOG CftTCGCQCXSC 14220 

AAGGOCGCCA GCAATACTQC OCTOQOCTOG TOGCTTGCAG AQCaOCTCTC CICCCKXX^ 14280 

OTGCrGAAC QOGAGOQOCT OaXTTOGRT CI0GT0CX3CA OOGRftGOOGC CTCXX3T0CTC 14340 

GGOCTOGOCT OSTTCGMOC QCICGMXXC OOOGCXnC TACAAraOCT OGGGCIOSAT 14400 

IXXXmCAXGG OCXn?OGAGCT CXXSAATOGA CTOGOOGOOG OCQOCGGGCS GOGSGCICCAG 14460 

GCTACTCTCC TCTTCaClA TCCAAOOOOG ACTQOQCTCT CACQCTTTTT CaaSAOQCAT 14520 

CTCTTCGGGG GAACCAOCCA CCX^CCCCGGC GTACX^GCICA CXXXXSGGGGG GAGOGAAGAC 14580 

Crra!rCGCXA TCSTGGCGAT GAOCIGOCGC TTCXX3GGG0G AOSIGOGCAC GOCTGAGGAT 14640 

CICTGGAAGC TCTTGCIC6A OGGACAAGAT GCX»!l!CIOOG GCTTTCOOCA MNXXSCGGC 14700 

TGGAGICTOG ArGGGCIGOl OGO0OCX3GGT OGdOOrAG TCX^GGGAGGG GGGCTEOGIC 14760 

lAOGAOGCAG AOGCX^TTOGA nOOGGCXniTC TIOGGGA!ICA GIOCAOGIGA ASOGCTOGOC 14820 

GTOXSmXOC AACAGCX3CAT TTTGCTO^ A!rCACA!IGG6 AAGCX:TT0GA GCX?TGCAGGC 14880 

A!ICGACOOGG OCIOCXTOCA AOGAAGOCAA AGOGGGGICT TOGnOGOGT ASIGGGAGAGC 14940 

QOACCAAT GCA!IOGCIGG TS^AOGOGAC OXSGOGAAXAC AAGC3O!0GT imsmSGT 15000 

AGOGCAGOGC GTCOGTOOGG OOGAATOGCA TACA0GTT06 GAC^n^CAAGG G O OOGOCMC 15060 

AGCGIGGAGA CGGOGZGCAG CTOXXTOGTC GOGGITICACX: TOGOCIGOCA GGOCXXXXXX: 15120 

CACGGCQ^ ACTCOCIGGC GCTOGCTGGC GGOGIQ^OCA TCA3X3GC3CAC GCXAGOCAXA 15180 

TTCATCGOCT TCGACTCCGA GAGCGCGGGT GCOCCCGACG GTCQCTQCAA GGCCTTCICG 15240 

OCGGAAGOOG Aa3GTT0GGG CIGGGCOGAA GGCX300GGGA TGCICCIGCT OGAGOGOCIC 15300 

TCXXSAIGGOG TOCAAAAOGG OCAIICOCGTC CIOGOOGTOC TTCX3UGGCIC OGCX^GTCAAC 15360 

CAGGAOGGCC GGAGCCAAGG OCIGAOOGOG (XCMiOXjGCC CTG00CAG(3^ GCX30GICA!I!C 15420 

CGGCAAGCGC TOGACAGOGC GCGGCTCACT OCAAAQGACG TOGAOGTOSr OGAGGCTCAC 15480 

GGCAOGGGAA OCACCCICGG AGAOOOCATC GAGGCACAGG CXX^TTTTTGC CAOCXA!IGGC 15540 

GAGGOOCATT CXX3VAGACAG ACXXX^XHIGG CTZGGAAGOC OCAAGIOCAiA CXriGGGACAT 15600 

ACXCAGGCTG GGGCX^GGOGT OGGOGGCATC A!ICAAGKIGG TGCIOGOGTT GCAGCAOGGT 15660 

CTCTTQCOCA ASAOOCTOCA TGOOCAGAAT CCXJrCCCCCC ACATOGAC3X3 GTCTOCAGGC 15720 

ATCGTAAAGC TOCTGAACGA OSCOGTOGCX: TQGACGACCA GCGGACA!ICC TCQC0GCX3CC 15780 

GCTGTTTCCT CGTTCX3G0GT CTCCX^GCACC AAOGOOCMG TCMXX30GA AGAGGCTCXX: 15840 

GOCGCCACGC GQGCTGAGTC AGGOGCITCA CAGCXnOTVT CQCAGOCXCT OOOOGCGQOG 15900 

TGGCXOGTCG TOCTGTOSGC CAGGAGOGAG QOCGOCXSrCC QOGOCXaGGC TCAAAGQCTC 15960 

OGOGAGCAOC TQCTOQOCCA AQGOGACXHIC AOCXTOQOOG /mSTOGOCIA TTOGCTQGOC 16020 

AOC3\CCCGCX3 CCCACTTCGA GCaCOQOGCC GCTCTCGTAG OXIAOGACCG OGACGAGCTC 16080 

CTCTCCGCGC TCGACTOQCT OGCCCAGGAC AAGCXXX3CAC 0GAGCA0CX5T OCTCGGAOGG 16140 

AGOGGAAGOC AOGGCAAGGT OGICTTCXSTC TTIOCIGGGC AAGGCIOGCA GIGGGAAGGG 16200 

iCEGGCOCICT CXriGCICGA CIOCIOGCXX: GICIT0CX3CA CACAGCT0G7V AGCAIGOGAG 16260 

OGOGCGCIOC GICCICAOGT OQ^GIGGAGC CIGCTOGOOG !ra:a:GC3GOCX3 OGAOGAGGGC 16320 

GOCCCCTOOC TOGAOOQOGT CGACGTOSIG C3\G00CG00C TCTTTGCCX3T CATGGTCTCC 16380 

CTGGCOGOCC TCTGGOGCTC GCTCGGCGTC GAQCXX»XX3 CXXJrCGTOSG CXaCAQOCS^ 16440 

GGCGAGAXAG OOGCOGCXTT OGZICGCAGGC GCICICIGCX: !E0GAGGACX3C GGOCX^GCAHC 16500 

GCCGCCCSXX: GCAGCAAAGC GICACCAOOG TOGCX^GGCZA OGGGCAXGGC 0GCX3GI0G»G 16560 

CICGGOGOCT COGACXnXXIA GAOCIACCTC GCIOCCIGGG GOGACAGGCT CICCA!EOGCX: 16620 

GOCGTCAACA GCXXXAGGGC CAOGCTOGTA TOOGGOGAGC CTGCX^GCX^GT OGAOSCXSCIG 16680 

ATCGACTCGC TCACCGCAGC GCAGGTCTTC GCCOGAAGAG TOOGCGTOGA CTAOQCCXCC 16740 

CACTCAGCCC ASATGGACGC OGTOCAAGAC GAGCTCGCXX; CAGGICTAGC CAACATOSCT 16800 

CCimSACGT GOGAGCKXX: TCTTTATIOG AOOSrCACXXS GCAOCAGGCT 0GACX3GCI0C 16860 

GAGCTCGAOG GCX^OGIACTG GIA3CGAAAC CTCOGGCAAA C30GI0CIGTT CICGAGOGOG 16920 

ACOGAGOGGC TCCIOGACGA TGGGCAXCXSC TTCTIOGTCG AGGICAGCOC TCA!EO0CX3TG 16980 

CTCACGCTCG CCCTCCGCGA GACCTGCGAG CGCTCACCGC TCGATCCOGT OGTCGTOQGC 17040 

TCCMTCGAC GCGACGAAGG CCAOCTOCCC CGTCTCCTTG CTCTCTTGGG OCGAGCTCIA 17100 

TOQCOGGGOC TCRCGOCOSA GTGGAAQQCC TOXrrrOQOGC CCTTOGCTOC OOGCAAGCSrC 17160 

TCACTCXXX:a CCTTWCX^CCTT CX^^GCGCGAG CCnrrCTGGC TOGftCGOCOC O^AOSCACAC 17220 

CCCGPAGGCG TOQCTOOOGC IGCGCCGKSX: GATQQQCX3GT TTTQQCAAGC CRICXaAOGC 17280 

GGQGRCCTOS AOGOGCICAG CQQOCAQCTC CftOGCQGaCG GCX3A0GAGCA QOQOGOOQCC 17340 

CTOQCOCTQC TOCTTOCXaJC CCTCTOGRQC TTTCAOCaOC AGOGCXIAAOI GCAGAGCAOG 17400 

GTOGACftOCT GQOGCEAOOG CMCaOGIGG AGGC3CICT(3V OCACXa3CXX3C CaOQCrOQOC 17460 

GACCTOGCOG GCAOCTGGCT CCTCGTCGTG CCXSTCOGCGC TOQGCGAOGA OQOQCTOOCT 17520 

GCCACGCTCA CCXSOGCGCT TACCCGQCX3C GQOGOQCGTG TCCTOSCGCT GOGOCTGftGC 17580 

CAQGTICACA TAGQOOGCGC QQCTCICAOC GAGCaOCTGC QCGaOGCIOT TOCOGAGftCT 17640 

GCCCCffiTTC GC5QGCGTGCT CTXrCTOCTC GCXXn!aaCG AGOQCXXXCT OGOQGACCAT 17700 

GOCQCOCTGC CCQOQQQCCT TQOCCICTOG CTCGCCCTCG TOCMGCXTT CQGOGAOCTC 17760 

GOCCTCGRGG CTCCCTTGIG QCECTTCACG CQCGG0GCX3G aJCTOaTTGG ACACTOCGAC 17820 

CXaCTOGCXX: ATCCCACCCA G^XMXaTC TGQGGCTTGG GCCGCGTOGT C5QQ0C1CGAG 17880 

CACCOCGAGC QGIGQQQC3GG GCTOGTOGAC CTOGQOQCftG OGCTOGftOGC GaOCQOCGCA 17940 

GGCOQCTTQC TCOOGQCCCT CX30CX3U3C3QC CAOSaOGAAG AOCaGCTOQC QCIGOQOOOG 18000 

GCCQGaCTCT AOQCaOQOOG CTTOSIOOQC GCCCCGCSCG GOG3VraOQOC TOCXX3CTCQC 18060 

QQCTTCKEQC 0CC3GAGQCftC OVTCXTCATC AOOQGTGCTA CCGGCQCXXT TOGOQCTCAC 18120 

GTOSCCOGAT GQCTOGCTOG AAAAGQCX3CT GAGCACCTCG TCCTCATGAG CCXSaOGAGGG 18180 

GCCXIAGGOCG AAGGOQOOGT QGAGCTOCAC GCXXSAQCTCA OCXSOOCIOSG CQCGCGOGIC 18240 

ACCTTCGCXX3 OGTGOGMGT OGOOGftCftGG AQC3QCTGT0G OCaCGCTTCT OGRGCaOCTC 18300 

GACGOCQGftG QGOCaCftGGT GAQ0QCXX3TG TTOCAOQOQG GCQQCMaa QOOOCaOGCT 18360 

CGGCTCQOOG CXaCCIOCAT GGAGGaiCTC QCXSGAQGTTG TCTOCX»aA QGIftCaAGGT 18420 

GCAAGACAOC TCCAOSiOCT GCTOQQCTCT OGACCCCTCG ACGCCTTTGT TCTCTTCTCG 18480 

TCCGGCGCQG TCGTCTGGGG CGGCX3GACAA CAAGGCGQCT ATGCCX3CTGC QVACGCCPIC 18540 

CTCGAIGCOC TQGCCGAGCA GCGGCGCAGC CTIQGQCTGA OQQOGACKIC QG3X3QOCTGG 18600 

GQCGTGTOGG GCX3Q0QG0QG CMX3QCXACX: GQGCICCTGG CAGOCCAQCT AGAGCAAOSC 18660 

GGTCTOTOQC OGftTGQCX«: CTOGCTGQOC GIGGOGAOGC TOQOQCTQQC GCIGGftGCAC 18720 

raCGAGACXIA CXXnCAOOGT OGCOGACATC GAC1X3GGCX3C GCTTTGOGCC TTCGITCAGC 18780 

GCCGCIOGCT CXX:GCXX:GCT CCIGCaCGKr TEGOCOGAGG OGCAGOGGGC TCICGAAGOC 18840 

AGCGCOGAUG OGTCCICX^ GCMGAOGGG GOCACAGGCX: TOCIOGACATl GCKX^GAAZVC: 18900 

CQCTCGGAGA GOGRGCAGAT CCACCIGCTC TOCTOGCTGG TGCGOCAOGA MSCOSCCCTC 18960 

GTCCTGGGCC ATAOCXSACGC CTCXXaGGTC GAOCOCCACA AQGGCITCAT GGA0CIC3GGC 19020 

CTCGATTCGC ax:A!ZGACOGT OGAGCTTOGT OGGCXSCTIGC AGCAGGCX:AC OGGCATCAAG 19080 

CICXX3QQC3CA OOCTOGCCTT OGAOCATOO: OXTOCICATC GOGT O XXSCT CTTCTOIGOGC 19140 

GACTOQCTOG OCCfiaSCCCI CGGCGCGRGG CTCTOCTOG AGOGCGAOGC OQOOGOGCIC 19200 

CXX^QCGCTTC QCTOGGOGAG CGAOSAGCOC ATCQOMCG TO3GCMGGC CCTOOSCTTG 19260 

CX:»GGCX;QCA TCGGCGATCT CGACGCTCTT TCQGAGTTCC TOGOCXaAOG AOSOGAOGOC 19320 

GTOSAGCXXA TTOOOCftirQC OOGATGGGAT GOOQGTGCXX: TCTAOGAOOC CGAOXnaC 19380 

GCXaAGGCCA AGAQCIADGT O0GGC3VTQCX: QCOOGCICG AOCAGGIOaA CX^CT^^ 19440 

CXn!GCCTTCT TIGGCATCAG OOCTOGOGAG GOCAAATAOC TOGACCXXXa^ GCAOC3GCX3!G 19500 

CECCTOSAAT CTX3CCTGQCT QGCCXTOGAG GAOGOOQGCA TOSTCCOCTC CAOrTCAaG 19560 

GATTCroCCA COGGOGTCTT CGTCGGCATC GGCGCCAGOG AAIAOGCaCT QOaAACAOG 19620 

AGCTCOGAAG AGGTCGAAGC GTA3X300CTC CRAGGCAOOG CCX3GGT0CIT TGCXX3C3QQQG 19680 

OGCTTGGCCT ACAOQCTOGG CCTGCAAGGG OCX330QCTCT OGGIOGACAC OSOCTGCIOC 19740 

TOCKX^CICG TOGCrCIOCA CCnXXjCCTGC CAAGCXTTQC GACAGGG0C31 GIGCAAOCIC 19800 

GCCCICGOCG OQQQOGICTC OGTCATGGOC TOOCXX^QQGC !ICTTO3T0Gr CXOTTCOOGC 19860 

AIGOGTGCTT TGQ0G0CXX3A TGGCXX3CTCC AAGAOCTTCT OGACCAAOGC OGAOGGCTAC 19920 

GGACGCGGAG AGGGCGTCGT CXSTCCTTGCT CTCGAGCX3GC TOGGCGACGC OCTCGOOCS^ 19980 

GGAGAOm; IXTTOGCXX:! CG^ 20040 

GGCAHCACXX; OCOCCAAITGG CAOCTOOCAC CAGAAGGIOC TOOGOGCXXX; GCICXAOGAC 20100 

GOCCAIATOG GCXX7CGC0GA OGIOGAOGIC GIOGAAXGCX: AIGGCAOOGG CAOCICXriTG 20160 

GGAGACOOCA TCGAGGIGCA AGOCXTTGGCC GCOGTCIAOG OOGASIGGCAG AOCTGCIGAA 20220 

AAGCCTCTCC TTCTCGGOGC ACTCAAGACC AACATTGGCC ATCTCGAGGC CGCXn?OCGGC 20280 

CTCGCGQGCG TCX3CCAAGAT OGTOQCXnCC CTCCGCCATG AOGCXXTOCC OCXXaOCCTC 20340 

CACAOGACOC OGCGCAATOC OCIGATOGAG TOGGATGGGC TCGCXATOGA CGrCGSCGKT 20400 

GCCACX3AGGG CGTGQGCCCG CCACGAAGAT GGCAGTCOCC GCOGCGOOGG OGTCTOCGCC 20460 

TTCGGACTCT OOGGC?^OCAA OGOOCAOGTT ATCCTOGAAG AGGCTOCX^GC GA1C00GCAG 20520 

GCCGM3CCX3i OOGOGGCACA GCIOGOGIO; CAGOCX^CTIC OCX^CAGOCIG GCXX^GIGCIC 20580 

CTGICX3GOC7^ GGAGCQ^GOC GG00GIGCX3C GCXXAGGOCX: AraGGCTCXXS OGACCAOCTC 20640 

CrOQCXX2M3G AOGACCTOSC CCTOGCCGKr GIAGCXICACT OQCIOSCCAC OOXXSGGCJ: 20700 

ACXnrrCGAQC AOCXSTGCOGC TCTOGTGGTC CAa33\OOGCX5 AAQiGCTOCT CTCOGCGCTC 20760 

GA3TCGCTCG CCCAGGGAAG GCOCQCOCOG ASCAOOGTOG TOGAAOGAAG OQGAAGCXaC 20820 

GGCAAQGTCG TCTTCGTCTT TCCTGGQCaA GQCTOQCAGT GGGAAGGGM GGOCCTCTCC 20880 

CTGCTCGMA CCTOQCXXKr CTTOOGGQCA CftGCIOSAAG CX3IQ0GAG0G OQCXXnCQOG 20940 

OOCCAOGTGG ACTQGTOQCT QCTOSCXSGTG CTOOQCQQOG ASGaGQQOQC QOCXrOGCTC 21000 

(aOOGQGlOG AC3GTGGT0CA GCXX3G0QCTG TTCTOGKTm TGGT CTO GCT GGOCGCCCTG 21060 

TGGCGCTCCA TQQGCGTOGA GCCCGAOGCG GTGGTOQQOC ATAGCX3U3GG CGAGATOSOC 21120 

GCGGCXTOTG TGGCGGGOGC QCTGTCGCTC QU3GA0GCIG OCMQCIQGT QQCGCTQCQC 21180 

AQCOGIGOQC TOGIGGRGCT CGOOGGOCftG GGGGCXaiGG CTQOGGIGGA QCTGCOGGAG 21240 

G0CXSU3GI0G CAOGGOGCXTT OCAGOGCZAT GGOGATOGGC HCIOCATCGG GQO G ftlCRAC 21300 

ASOCCIOGTT TCAOaROGair CTOOQGOGaG OCXXXn!GOOG TOQCXXXXCT GCTOOQOGRT 21360 

CTGGAGIOOG AQGGOGTCTT CGC30CTCAAG CTGAGTTAOG ACTTOQCXrrC CCACTOC3G0G 21420 

CAGGTCGAGT OGATTOGOGA CGAQCTCCTC GRTCTOCTGT OGTQQCTOGA GCX^GOGCTCG 21480 

ACGGOGGIOC OGTICXACIC CACX9GIGAGC GGOGCOGOQ^ TCGAOGGGAG CXaG C TOGAC 21540 

GCCGCCnCI GGTAOOGGAA OCICXX^GCAG OOGGIOOGCT OtXSCACaOGC TG3XXAAGGC 21600 

CTCCTIGCCG GRiaACfiTOG CTTCTTOGTG GAGGIGAGOC CXaGTOCTGT GCTG»OCTTG 21660 

GCXrnXSCAOG AOCTOCTOSA AGOGIOGGAG O GC TOGGOG G OGGIGGIOGG CnCTCT G IGG 21720 

AGCG?\OGAAG GGGATCTAOS GOGCTTCCTC GTCTOGCTCT OOSUSCTCTA aSTGAAOaQC 21780 

TOmXCTGG ATTGGACGAC GMCCTGCCC OOOGGGAAGC GGGTGCTGCT QCXXaCCTAC 21840 

OOCTTOCAGC GOGftGOGCTT CIGGCICGAC GOCIOCAOGG CAaX:GOOGC CGGCX3ICAAC 21900 

CACCTEGCTC OGCICGAGGG GOGGTTCIGG CAOGOCA!I!OG AGAGOGGQUl TATOGAOGOG 21960 

CICAGCX^GCC AGCIOCAOGT GGAOGGOGAC GAGCAGC3G0G OOGCXTTTGC OCIIX^CICCTT 22020 

OXaCCCTCG CraGCTTTOG CCAOGMCXSG CAAGAQCftOG QCaOGGTCGA CQOCIGQOGC 22080 

raCCQCATCA CX3TGGAAGCX: TCTGAOCACC GCCAOCaCGC OOQCXXaOCT QQCCQQCACX: 22140 

TGQCTCCTOG TOGTGCOQQC CQCTCTGGAC GAOmOSOQC TOOOCIXXSQC QCTCAOOCaG 22200 

GOQCTCmr GGCGCGGCGC GCQCGTOCTC GOOSTGCGCX: IGAGCCAGGC CCACCTGGAC 22260 

CGCGAGGCTC TCGCCGAGCA CCTGOGOCAG GCTTGOGOOG AGAOOQCGCX: QCCTOGOQQC 22320 

GTGCTCTOGC TOrrOGCCCT OGAOGAAAGT OXXnXSSCCG ADCKCGOOQC CGTGOCX^CG 22380 

QGACTOQCCT TCICGCSCfC CCTCGTCCAA GOCCTCGQOG ACMmXXrr OGAOGOGOOC 22440 

TTGTGQCTCT TCACCXS3CGG OSCOSTCTOC GIOQGaCACT COGRCCXXair CGCXX3VIO0G 22500 

POSCMpGCGA TCAOCIQQGG CrTGGQCX3GC GTOGTOGQCC rCGM3Cf€CC OGaGOGCIGG 22560 

GGAGQQCTOG TOGACGTCGG CGCAGCGATC GACGCGAQOG COGTOGGCOS CTTGCTC300G 22620 

GTCCTCGCCC TGCGCAACGA TGAGGACCAG CTCQCn!Cia: QCX50GGC0GG GTTCTAOGCT 22680 

CGOOGCXrrOG TOOGCGCTCC QCTOGQOGftC GOGCX^SOCG CROGTROCra C3^AQC30COGA 22740 

GQCACOCTOC aOVTCAOOGG AGGCAOOQQC GCXX3C3X3QOG CICAOGXOQC OCXMGQCTC 22800 

GCTOGaGAAG GOQCAGAQCA CCTCXSTOCTC MCAQCOSCX: GAQGQGCCCA QQOOGftQGQC 22860 

GCXrrOSGAGC TCCACGOCXSA GCTCAOSGOC CTGGGOGCGC GOGICADCTT OGOCGOGTGT 22920 

GATGTCGOCG ACAGGAGOGC TGTCGCCACG CTTCTOGAGC AGCTOGAOGC OGAAGGGTOS 22980 

CAGGTCOGCG COGTGTTCCA CQOGGGOGGC MaSQGCGOC AOSCTCXXSCT OQCmxaOC 23040 

TCTCTCaiTGG AGCT0GCXX3A OGTTGTCTCT GCXaAGGTOC TaOGOGCAGG GaAOCTCCRC 23100 

GAOCIQCTCXS GTCXnXXSaCX: CCTC3GR0QCC TTOgl O Jm' trCTOGTCXaT OQCaGQOGIC 23160 

TOGQGOGGOG GACAACAAGC OQGMAOGCSC GCXaSGAMCSG OnTOCTCGA OGCXXHIGGCX: 23220 

GACXIAGCGGC GCAGTCTTGG ACAGCCGGAC AOGTCCGTGG TGTGQGGCGC GTGGGGCGGC 23280 

GGOGGTQGTA TAITTCAOGGG GCCOCTGGCA GOOCAGCTQG AGCAACGTOS TCTGlHaCXXS 23340 

lOGGCCCCn OGCIGGCOGT GGCGGCX3CIC GOGCAAGOCX: TGGAGCACGR OGAGACTACC 23400 

GTCAOOGTOG CX^GftCATOSA CTQGGOGOQC TTTGOQOCTT OGATCAGOQT OGCTOGCTCC 23460 

OGCOQCTCCT GCXSCGACTTG CXXX3AGCAGC GOGCTCTOGA AGACASAGAA GGOSOGTOCr 23520 

OCTCOGAGCA CXSGCCOGGCC CCCOSACCTC CTCGACAAGC TOCGGAGOCG CTCGGAGAGC 23580 

GAGCAGCTCr GTCTGCTOGC OGCGCTGGIG TGOGAOGftGA OGQCXnOGT OCTOGGOCAC 23640 

Q^AQGCCGCT OICOCAGCTOG ACrCCGACAA GGCTTCTTOG AOCTOGGTCT CXMTOGATC 23700 

MGAOOGTOG AGCTTCGTCG QOGCTTQCAA CAGGCCACCG GCATCAAGCT 0C0QGCC3\CC 23760 

CTCGCCTTOG ACCAaXXDCTC TCCTCaiaSC GTC3G0GCTCT TCMQOGOGA CTOaCTOQCX: 23820 

OVOGOOCIGG GCAOraOGCX CTC0GCX3GAG GOGAOGOOQC 0G0GCI00G6 OOGOGCX3X::G 23880 

AGOGAOGAGC CCXICGCaU OGIOGGCATG GCXXnXSOGOC TGOOGGGOGG OGICGGCXS^ 23940 

GEOGAOGCrc TTTGGSU3XT OCOOilAOCATV GGGOGOGAOG OGGIOGAGOC CATTCCACAG 24000 

AOOOGCax;GG AOGCOGQTGC OCTCZACXaAC CCCGMXXXXS AOGOOGAOGC CAAGAGCTAC 24060 

GTCCGGCATG CXX30GATGCT CGACCAGRTC GACCTCTTOG ACXXHX^CXOT CTTOGGCMC 24120 

AQCXXXX23GG AQGOCaAACA CXnaaOXC Cf^GOOCGOZ TGCTOCTOGA MXaiGOCTGG 24180 

CTQGCXXnTO MGACGCX33G CftTCGTOXX: AO^CCCTCA AGmCTCCTT CAOOQQOGTC 24240 

TTOSrOGGCA TCTGOGOOGG OSAATAOGOG AIGCAAGA06 CXSAGCIOGGA AGbTJ O O GA G 24300 

GITTACTTCA TOCJ^AQGCAC TTOOQOGTCC TTTGQ0QC3GG QQGQCrTQGC CTAIIAOQCTC 24360 

GQGCTCCAQG GQOOQOGATC TTOGGTOGAC AOOGCCTGCT CCTCCTCGCT CX3TCT0CXn?C 24420 

CACCTOGCCT GCXAAQCCCT OOGACAQQGC GAGTGC3iA0C TJSOCCICGC OQOQGQOGIG 24480 

TCXXnCATQG TCTCXXXXX31 CaCXnTOGTC imSCTTTOCC CTCIGOQCGC CTTQQCSQOCC 24540 

rajOGGCX3GCT OCAAGACCTT CEOGOO^ GCXXsAOGGCT f€GGMJ30BG AGAAGGOGIC 24600 

GTCXriCClTC CXnaaOOG OVIOGGCGAC GOOCIOGOOC GGAGACAOOG OGIOCXGGIC 24660 

CT0GT0CXX3G GCA0CQCX3Vr CAADCaOGAC GGOGOGTOGA GOGGIATCAC OSXXXXXMC 24720 

GGCACCTOX: AGCAGAAGGT CCIXXGQGCX: GOGCTOCACXS ACOTCCGCaT CAO0OC3OGOC 24780 

GAOGTCXSAOG TOGTOGASIG CX:2m3GCAOC GGCAOCICGC TOGGA6A00C CA!IOGAGG!rG 24840 

CAAGCXriGG CXm^GICIA 0GCX:GA0GGC AGAOXGCIG AAAAGOCZCT CXTETCIECGGC 24900 

GOGCTCAAGA OCAACAICGG CCKJCCTCGMi G00G0CICXX3 GCCTECGOGGG OSICGCXAAG 24960 

atogtogoct oqctcosoca cgaogcxxttg ocxxxxaocx: TCcaoocxsAc ocxaoGcaAT 25020 

OCCCTCATCG AGTGGCS^GGC GCTCGCCATC GAOGTCGTCG MADCCCGAG GOCTTGGCXr 25080 

CGOCAOGAAG ATQGCAGIOC OCQCOQOGCC QGCATCTCOG OCTTOSGATT CTOSGGCAOC 25140 

AAOGOOCAOG TCATOCIOaA AGAGGCIOOC GOOGOOCTGC OGGOOGAOOC CX^CXAOCICA 25200 

CAGOOGGOGT OGCAAOOOGC TOOOGOSGCaG TGGCXX33TGC TCCIGTOGGC CAGGAOOGSM? 25260 

GCX3G00GICX: GOGCXX3U3GC GAAGOGGCIC OGOGAOCAOC TOGIOGCXXIA OGAOGACXnC 25320 

A0CCTCX30GG MX3TGGCCTA TTOGCTGGCC ACCACCXX30G <XX»CTTOGA GCaC30G0Q0C 25380 

GCTCrOGIAG OOCACAAOOG OGACGAGCTC CTCamSOQC TOGACIOGCT CGCOOGGRC 25440 

AAGCCOGCCC OGAGCaOXST OCTOGGAOGG AGOGGAAQOC AOGGCaAQCT CXSICTTOQTC 25500 

TTTOCTQQQC AAeOCTCQCA GTGGGAAGQG AIQQCrCTCT OXnXXnXXA CTOCTOSOOC 25560 

GTCTTCCGCX5 CTCAGCTCGA AGCATCCGAG OSOGOQCTCG CTCCTCACX3T OGAGTOGAGC 25620 

CraCTOGOCG TCCTGCQCaS CGACGAGGGC GCCOCXnm: TCGAOCQOGT OGAOGTCGIA 25680 

CAGCX30Q0CX: TCTTTGCX3GT CftlGGICTOC CTGGOQGCXX: TCTGGOQCTC GCTCX3G0GTA 25740 

GRGCCC3Q0CG OOG!rOGTOGG OCACAGTCAG GQOGAGATOG CX3G00QCXOT OGIOGCAQGC 25800 

GCTCTCroOC TOGAQGAOGC GG0CX3GCaTC GC0GCXX3GC GCASCAAAGC GCTCROaOC 25860 

GT0GCXX3GCA AOGGQQOCaT GG0CX3C0GTC GAGCTCGGCG CXTTCCXSACCT CCAGACCTAC 25920 

CTCGCTCCCT GGGGCGACAG GCTCTCCMC GOCQCCGTCA ACASCCCCAG GGCCAOSCTC 25980 

CTGTCOGQCG AQC0CGCCX3C CaiOGAOGOG CTGAIOmCT OQCICftCOGC AQCGCAGGTC 26040 

TTOGCOOGAA AAGTOOGCGT 0GACIA0GC3C TOCX^CICXXj OCXaGMKSGA OGOOGTCCAA 26100 

GAOGAQCTOG CXX3CAGGTCT AGOCAACRTC QCTOCTOSV CGIGOGAQCT CXX3CTTTAT 26160 

TOGACOGTCA COQGCACCAS QCTaSROSGC TOOSAGCTCSG AOGGOGOGTA CTGGIATOGA 26220 

AACCTCCGGC AAACCGTCCT GTTCTCGAGC GCGACCGAGC GQCTOCTCGA aSMGQGCAT 26280 

CGcrrcrras togaggtgag ooccckjxxx: gtqctcaoqc tcgccctoos aaoAOCTGC 26340 

GAGOGCTCAC CGCIOSAZrOC OGTOGIOGIC GGCTCXamC GAaSOaOSA AQQOCACCTC 26400 

GCX30G0CrGC TCXHCTOCTG QGOGGAQCTC TCEA0CX3GAG QCCTOQOQCT CXaACIOaAC 26460 

GCCTTCTTOG OGCCXOTOGC TOCXXXXAAG GTCKXrTOC OCaOCTAOCX: CTT0CAAC3QC 26520 

GAGCQCTTCT GGCTOGAOGC CTCCAOGGCG CAOGCTGCCG ACGTCGOCTC OGCAGGCCTG 26580 

ACCTCGGCCG ACCACCCGCT GCTOQGOGOC GCXXSTOSCCC T0G006ACOG OSAIXXSCTTT 26640 

GTCTTCACAG GAOGGCICIC OCTOGCAGAG CAOCX^GIGGC TOGAAGAOCA OGICGICTIC 26700 

GGCATAOXIT GTCXnXSOCAG GOQCOGOCTC CTOaGCTaS CXCTGCaaXST OGCOCASCTC 26760 

GICGGCCrCG ACACCGIOSA AGAOSTCAOG CICGRCOCCC OCCTOGCTCT CXXMOGCAG 26820 

GGOSCXXSTOC TCXTEOCAGAT CTCJOGTaSQG OOOGOGGAOG GTGCTGGAOG AAQGGOQCTC 26880 

TCCGTTCATA GCCGGOGCCA OGAOGOQCIT CAGGAIGGCC CCTGGACTOG OCAOGCCAQC 26940 

GGCTCTCTOG CXX»AGCTAG CmSTCXXM' TGOCTTOGAT GCTOOQOGAA TQQOOOOOCX: 27000 

CAOOCAAGGT TTCIAOGCAG OCXTECGAGAG CGOrGGGCn 27060 

GCTTATGGCC C0GAGTTC3CA GGGCTTCOGC OGCCCTCTAC AAGCGCGGCG AOGAGCICTT 27120 

CGOCXSAAGCC AAGCTOOOGG AOGOCGOOGA ASAGGACX^OC GC1XX3TTTTG COdCCAOOC 27180 

CGOOCTGCrc (3VCAG0G0CT TGCAGGOOCT OdOCTrTGIA GAOGACXAGG CAAAGGCXHT 27240 

CAGGA3G00C TTCTOGIGGA GOQGAGIMC QCTGOQCTOC GGTOQGRQCC AOCAOCCTGC 27300 

GCGTGOGTTT CCAOOGTCCT GAQGOaaM OCTOQCQCTC GCTOCTOCTC GCXXaCGOCA 27360 

GAGQCX3AACC CaTOQOCTCG GTGCAAGCXX: TCCOCaTGOG C3QCX3GOG?rCX: QOCGAGCaGC 27420 

TCCGCAGACC CGGGAGCGTC CCAOCTOGAIX GOCCTCTTCC GCftTOGaCTG GRGOGAQCIG 27480 

CAAAGCOCCA CCICAOCGOZ CATOGOCCCG AGCGGTGCCC TOCTCQQC3\C iyGMOGTCIC 27540 

GACCTOGGQV CXIIAGGGIGOC TCIGGAOOGC TAXAOOGAOC TTOCIGCICT AOGCAGOGOC 27600 

CrCGAOCAGG QOGCTTOQCC TCCaAGCXJEC CTCSOOSCCC CCTTCAKSC TCTGCXXSAA 27660 

GQOaOCTCA TOGCGAQOQC OOQOGaGADC A0CX30GCA0G CXX^XOSOCCT CTTQCAAGCX: 27720 

TGGCTCG0CX3 AOGAGCQCCT CX3CCTOCICX3 OQCXTTOGOCX: TOCTCAOOCG A0G0GCCX3TC 27780 

GCCaCCXaOG CTGAAGEAAGA OGICAAGQGC CTOQCICACG OQCXrECTCax; GGGTCIOQCT 27840 

CGCTCCGOGC AGAGCXSAOC?^ 0CX3VGAG0GC CXmCIOGTOC TCGTOGACXn! OGAOGaCAGC 27900 

GAGGOCTOCX: AGCAOGCCTT GCICGGOGOG CIOGAOGCAA GAGAGOCAQV GATCGCXXniC 27960 

CGCAAOGGCA AAOCXX:iCGT 70CAA0GCIC TCAOGOCIGC CXTAGGOGCX: CAOGGACAOV 28020 

G0GTCXXXXX3 CAGGCXTTOGG AGGCAOOGIC CTCAICAOGG GAGGCAOOGG CAOGCICGGC 28080 

GCrCTGGTCG OGCGOCX3C3CT OGTCGTAAAC CAOGACGCCA AQCACCTGCT CdCMX^ 28140 

CGCCAGGGCG OGAGOGCICC GGGIGCTGAT GICTZGOGAA GOGAGCIOGA AGCICIGGGG 28200 

GCTTOGGTCA COCTCGOOGC GIGOGAOSIG GOOGA!ECX»C GOGCICISAAA GGAOCTTCEG 28260 

GA!I!AACATEC OO^GCGCICA OOOGGICGOC GCXX3ICGTGC AZGCXX30CAG OGIOCIOGAC 28320 

GGCGATCTGC TOGGCGOCAT GAGCXnOGAG 0GGA3!CGA0C GOGICTTOGC OCXTAAGATC 28380 

GATGCXX3CCT GGCACTTGCA TCAGCTCACC CAAGAIAAGC CXXOTGCOGC CTTCAIOCTC 28440 

TTCrCGTCCG TOGOCGGOGT OCTOGGCAGC TCAGGICACT CXIAACXAOSC OGCIGOGAGC 28500 

GCCTTOCIOG ATGOGCTTGC GCAOCAOOGG OGOGOGCAAG GGCIOCXrCGC CIOVITOGCIC 28560 

GCX?TGGAGCX: ACIGGGCXX3V GOSCAGOGCA ATGACAGAGC ACGTCAGOGC OGOOGGOGTC 28620 

CCTCGCATGG AQCGCX3CCGG OCTTOCCTCG ACCTCTGftfiG AGftQQCTOGC OCTCTrOaKr 28680 

GCGGOGCTCT TCX»3AMCGA GACOGCXXTTG GTCXXX3Q0QC GCTTOGACTT GavSOGOQCTC 28740 

AQGQCX3AA0G OOQQCAQOGT CCXmCTTG TOTAAOGTC TOGOXXXXXSC TOGCAOOGTA 28800 

OQCAftGQOCG CCAGCAACAC CX3CXXaGGCC TOGTCGCm CftGRQOQOCT CTCAQC30CTC 28860 

OOGCCX3GCX3G AAOQOCaGOG TGCXXTCGCTC QVICTCRTCX: QCftOOGAAGC CGCCGOOSJX: 28920 

CTCGGCCTCG CXTTCCTTCGA imX^CTCGAT CCOGATOG 28958 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 13 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 1..13 
(D) OTHER INFORMATION: /note= "sequence of a plant 
consensus translation initiator (Clontech)" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
GTCGACCATG GTC 13 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 1..12 
(D) OTHER INFORMATION: /note= "sequence of a plant 
consensus translation initiator (Joshi)" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
TAAACAATGG CT 12 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 1..22 
(D) OTHER INFORMATION: /note= "sequence of an 
oligonucleotide for use in a molecular adaptor" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
AATTCTAAAG CATGCCGATC GG 22 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 1..21 
(D) OTHER INFORMATION: /note= "sequence of an 
oligonucleotide for use in a molecular adaptor" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
AATTCCGATC GGCATGCTTT A 21 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 1..22 
(D) OTHER INFORMATION: /note= "sequence of an 
oligonucleotide for use in a molecular adaptor" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
AATTCTAAAC CATGGCGATC GG 22 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 1..21 
(D) OTHER INFORMATION: /note= "sequence of an 
oligonucleotide for use in a molecular adaptor" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
AATTCCGATC GCCATGGTTT A 21 
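
The adaptor oligonucleotides are listed in pairs (SEQ ID NO: 9 with 10, and SEQ ID NO: 11 with 12). As a rough consistency check, and not part of the original listing, the Python sketch below tests whether each pair can anneal by comparing one strand with the reverse complement of the other. The oligonucleotide strings are our reading of the listing (AATTCTAAAGCATGCCGATCGG, AATTCCGATCGGCATGCTTTA, AATTCTAAACCATGGCGATCGG, AATTCCGATCGCCATGGTTTA), and the helper names `reverse_complement` and `annealed_core` are hypothetical, introduced only for illustration.

```python
# Sketch: check that paired adaptor oligonucleotides are complementary.
# The sequences are our reading of SEQ ID NO: 9-12 (an assumption);
# the function names are illustrative only.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def annealed_core(top: str, bottom: str) -> str:
    """Longest suffix of `top` that equals a prefix of the reverse
    complement of `bottom`, i.e. the stretch that can base-pair when the
    3' end of `top` anneals to `bottom`. Returns "" if there is none."""
    rc = reverse_complement(bottom)
    for n in range(min(len(top), len(rc)), 0, -1):
        if top[-n:] == rc[:n]:
            return top[-n:]
    return ""


SEQ9 = "AATTCTAAAGCATGCCGATCGG"
SEQ10 = "AATTCCGATCGGCATGCTTTA"
SEQ11 = "AATTCTAAACCATGGCGATCGG"
SEQ12 = "AATTCCGATCGCCATGGTTTA"

CORE_9_10 = annealed_core(SEQ9, SEQ10)
CORE_11_12 = annealed_core(SEQ11, SEQ12)

if __name__ == "__main__":
    for label, core in (("9/10", CORE_9_10), ("11/12", CORE_11_12)):
        print(f"SEQ ID NO: {label}: {len(core)} bp paired core {core}")
```

Under this reading, each pair shares a 17-base complementary core. The core of the first pair contains GCATGC and that of the second contains CCATGG, the recognition sequences of SphI and NcoI respectively.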



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 15 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 1..15 
(D) OTHER INFORMATION: /note= "sequence of an 
oligonucleotide for use in a molecular adaptor" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CCAGCTGGAA TTCCG 15 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 1..19 
(D) OTHER INFORMATION: /note= "sequence of an 
oligonucleotide for use in a molecular adaptor" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
CGGAATTCCA GCTGGCATG 19 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 1..11 
(D) OTHER INFORMATION: /note= "oligonucleotide used to 
introduce base change into SphI site of ORF1 of 
pyrrolnitrin gene cluster" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
CCCCCTCATG C 11 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 1..11 
(D) OTHER INFORMATION: /note= "oligonucleotide used to 
introduce base change into SphI site of ORF1 of 
pyrrolnitrin gene cluster" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
GCATGAGGGG G 11 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4603 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 230..1597 
(D) OTHER INFORMATION: /gene= "phz1" 
/label= ORF1 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 1598..2761 
(D) OTHER INFORMATION: /gene= "phz2" 
/label= ORF2 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 2764..3600 
(D) OTHER INFORMATION: /gene= "phz3" 
/label= ORF3 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 3597..4265 
(D) OTHER INFORMATION: /label= ORF4 
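
The CDS locations above fix the length of each open reading frame. The short Python sketch below, which is not part of the original listing, converts the three CDS coordinate pairs (230..1597, 1598..2761, 2764..3600) into nucleotide, codon, and residue counts. It assumes that coordinates are 1-based and inclusive and that each location includes the terminating stop codon; the dictionary keys and the name `cds_summary` are hypothetical.

```python
# Sketch: derive ORF sizes from the CDS locations of SEQ ID NO: 17.
# Assumptions: 1-based inclusive coordinates; the stop codon is part of
# each CDS location. Names here are illustrative only.

CDS_FEATURES = {
    "ORF1": (230, 1597),
    "ORF2": (1598, 2761),
    "ORF3": (2764, 3600),
}


def cds_summary(start: int, end: int) -> dict:
    """Return nucleotide, codon, and residue counts for one CDS."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS {start}..{end} is not a multiple of 3 nt")
    codons = length // 3
    # The last codon is assumed to be the stop codon, so it codes no residue.
    return {"nt": length, "codons": codons, "residues": codons - 1}


SUMMARY = {name: cds_summary(*loc) for name, loc in CDS_FEATURES.items()}

if __name__ == "__main__":
    for name, info in SUMMARY.items():
        print(name, info)
```

On these assumptions ORF1 encodes 455 residues, which agrees with the residue numbering of the translation that follows, where residue 455 is immediately followed by the stop symbol.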



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



GCAaX3C0QIG ACCTCX33CXX3 GTGGCX3IGGC 0G00GG0CI6 CAOCIGSy^ OCAOOOOIGA 60 

OGACGICAGC GftGTGOGCTT OOSn^GOOGC C0GCXnX3C»T CAGGZOGOCA GOOGCIACAA 120 

AAGOCIGIGC GftOOOGCGOC TGAAObOCIG GCflAGOCan AdGQGQIGA TGGOCIGSftA 180 

AAACCAGOCC TCTICAAOCX: TIGOCICCIT TIGACIGGAG 'mxjxoiix: aig aoc 235 
Met Thr 
1 

GGC ATI OCA TOG AIC GIC OOT TAC GOO TSG OSS AOO AAC OGC GAC CXG 283 
Gly Ile Pro Ser Ile Val Pro Tyr Ala Leu Pro Thr Asn Arg Asp Leu 
5 10 15 

COC GIG AAC CIC G06 CAA TGG AGO AIC (30 000 GAG OGT GOO GIG CFG 331 
Pro Val Asn Leu Ala Gln Trp Ser Ile Asp Pro Glu Arg Ala Val Leu 
20 25 30 



CTG GIG CAT G?^C ATG CAG OGC TAG TTCCrGCGGOCCTOXSCCCGACGCX: 379 
Leu Val His Asp Met Gln Arg Tyr Phe Leu Arg Pro Leu Pro Asp Ala 
35 40 45 50 

CTGCGTGACGAAGTCGTGAGCAATGCCGOGOGCATTOGC 427 
Leu Arg Asp Glu Val Val Ser Asn Ala Ala Arg Ile Arg Gln Trp Ala 
55 60 65 



GCC GAC AAC GGC GTT CCG GTG GCC TAG ACC GCC CAG CCC GGC AGC ATC 475 
Ala Asp Asn Gly Val Pro Val Ala Tyr Thr Ala Gln Pro Gly Ser Met 
70 75 80 

AGC GAG GAG CAA CGC GGG CTG CTC AAG GAC TTC TGG GGC CCG GGC ATG 523 
Ser Glu Glu Gln Arg Gly Leu Leu Lys Asp Phe Trp Gly Pro Gly Met 
85 90 95 

AAG GCC AGC CCC GCC GAC CGC GAG GTG GTC GGC GCC CTG AOG CCC AAG 571 
Lys Ala Ser Pro Ala Asp Arg Glu Val Val Gly Ala Leu Thr Pro Lys 
100 105 110 

COC GGC GAC TGG GIG CTG ACC AAG TGG OGC TAC AGC GOG TTC TTC AAC 619 
Pro Gly Asp Trp Leu Leu Thr Lys Trp Arg Tyr Ser Ala Phe Phe Asn 
115 120 125 130 

TCC GAC CTG CTG GAA OGC ATG OGC GOO AAC GGG OGC GAT CAG TTG ATC 667 
Ser Asp Leu Leu Glu Arg Met Arg Ala Asn Gly Arg Asp Gln Leu Ile 
135 140 145 

CTGTGCGGGGTGTACGCCCATGTCGQGGTACTGATTTCCACCGTGGAT 715 
Leu Cys Gly Val Tyr Ala His Val Gly Val Leu Ile Ser Thr Val Asp 
150 155 160 



GOCTACTCOAACGATATCCAG0OGTTCCTCGTTGCC6ACGOGATCG00 763 
Ala Tyr Ser Asn Asp Ile Gln Pro Phe Leu Val Ala Asp Ala Ile Ala 
165 170 175 

GAC TTC AGC AAA GAG CAC CAC TGG ATGCCATCGAATAOGOOGOCAGOC 811 
Asp Phe Ser Lys Glu His His Trp Met Pro Ser Asn Thr Pro Pro Ala 
180 185 190 

GTT GCG CCA TGT CAT CAC CAC CGA OGA GGT GGT GOT ATG AGC CAG ACC 859 
Val Ala Pro Cys His His His Arg Arg Gly Gly Ala Met Ser Gln Thr 
195 200 205 210 

GCAGCCCACCTCATGGAACGCATCCTGCAACCGGCTCCCGAGCCGTTT 907 
Ala Ala His Leu Met Glu Arg Ile Leu Gln Pro Ala Pro Glu Pro Phe 
215 220 225 

GCCCTGTTGTACCGCCCGGAATCCAGTGGCCCCGGCCrcC^ 955 
Ala Leu Leu Tyr Arg Pro Glu Ser Ser Gly Pro Gly Leu Leu Asp Val 
230 235 240 

ClXi 2^ GGC G^A MG TCG CAA OOS Ct^ 1003 
Leu Ile Gly Glu Met Ser Glu Pro Gln Val Leu Ala Asp Ile Asp Leu 
245 250 255 

CCTGCCACCTCGAaX^GGCGOGCCTOSCCTO^ 1051 
Pro Ala Thr Ser Ile Gly Ala Pro Arg Leu Asp Val Leu Ala Leu Ile 
260 265 270 

CCC TAC OGC CAS Arc GOC GAA OGCGGTTrCGAGGCGGTGQ^GATGAG 1099 
Pro Tyr Arg Gln Ile Ala Glu Arg Gly Phe Glu Ala Val Asp Asp Glu 
275 280 285 290 

TCX30CGCTGCIGGa3A3X3AftCATCACCGAGCAGCA^ 1147 
Ser Pro Leu Leu Ala Met Asn Ile Thr Glu Gln Gln Ser Ile Ser Ile 
295 300 305 

GAG CGC TTG CTG GGA AIG CTXSOOCAACGIGCXXSATCCAGTTGAaCAGC 1195 
Glu Arg Leu Leu Gly Met Leu Pro Asn Val Pro Ile Gln Leu Asn Ser 
310 315 320 

GAA OGC TTC GAC Crc AGC GAC GCX3 AGC TAC GCX: GAG ATC GTC AGC CAG 1243 
Glu Arg Phe Asp Leu Ser Asp Ala Ser Tyr Ala Glu Ile Val Ser Gln 
325 330 335 

GTGATCQCCAATGAAATCGGCTCCGGGGAAGGCGCC AAC TTC GTC ATC 1291 
Val Ile Ala Asn Glu Ile Gly Ser Gly Glu Gly Ala Asn Phe Val Ile 
340 345 350 

AAA OGC AOC TIC CIG GOC GAG A!EC AGC GAA TAC GGC COG GOC AGT GCG 1339 
Lys Arg Thr Phe Leu Ala Glu Ile Ser Glu Tyr Gly Pro Ala Ser Ala 
355 360 365 370 

CTGTCGTTCTTTCGCCATCIGCrG<3^CGGGAG AAA GGC GCC TAC TGG 1387 
Leu Ser Phe Phe Arg His Leu Leu Glu Arg Glu Lys Gly Ala Tyr Trp 
375 380 385 

ACGTTCATCATCCACACCGGCAQCOGTACCTTCGTGGCTGaSTCCCO^ 1435 
Thr Phe Ile Ile His Thr Gly Ser Arg Thr Phe Val Gly Ala Ser Pro 
390 395 400 

GAGCGCCACATCAGCAICAAGGSVIGGGCTCTOGGIGATG;^ 1483 
Glu Arg His Ile Ser Ile Lys Asp Gly Leu Ser Val Met Asn Pro Ile 
405 410 415 

AGCGGCACTTACCGCTATOCGCCCGCCGGCCXAACCEGTCGAAGTC 1531 
Ser Gly Thr Tyr Arg Tyr Pro Pro Ala Gly Pro Asn Leu Ser Glu Val 
420 425 430 

ATG GAG TTC CTG GCG GAT CGC AAG GAA GCC GAC GAG CTC TAC ATG GTG 1579 
Met Asp Phe Leu Ala Asp Arg Lys Glu Ala Asp Glu Leu Tyr Met Val 
435 440 445 450 

GIGGATGAAGAGCTGTAAATGATGGCGCGCATTTGTGAGGACGGCGGC 1627 
Val Asp Glu Glu Leu * Met Met Ala Arg Ile Cys Glu Asp Gly Gly 
455 1 5 10 

CAC GTC CTC GGC CCr TAG CTC AAG GAA KTG GOG C3«: CTG GCXI CAC AOC 1675 
His Val Leu Gly Pro Tyr Leu Lys Glu Met Ala His Leu Ala His Thr 
15 20 25 

GAG TAC TTC ATC GAA GGC AAG ACC CATOGCGATGrAOGGGAAATCCTG 1723 
Glu Tyr Phe Ile Glu Gly Lys Thr His Arg Asp Val Arg Glu Ile Leu 
30 35 40 

CGCGAAACCCTXSaTTGCGCCCACCGTCACCGGCAG^ GAA AGC 1771 

Arg Glu Thr Leu Phe Ala Pro Thr Val Thr Gly Ser Pro Leu Glu Ser 
45 50 55 

GCC TGC OGG GTC ATC CAG CGC TAT GAN CCG GAA GGC CGC GOG TAC TAC 1819 
Ala Cys Arg Val Ile Gln Arg Tyr Xaa Pro Glu Gly Arg Ala Tyr Tyr 
60 65 70 

AGCGGCATGGCTGCGCTGATCGGCAGCGATGQCAAGGGCGQGCGrTCC 1867 
Ser Gly Met Ala Ala Leu Ile Gly Ser Asp Gly Lys Gly Gly Arg Ser 
75 80 85 90 

CTG GAC TCC GCG ATC CTG ATT OGT ACC GCC GAC ATC GAT AAC AGC GGC 1915 
Leu Asp Ser Ala Ile Leu Ile Arg Thr Ala Asp Ile Asp Asn Ser Gly 
95 100 105 

GAG GIG CGG ATC AGC G!EG GGC TOG AOC ATC GIG CGC CAT TCC GAC OOG 1963 
Glu Val Arg Ile Ser Val Gly Ser Thr Ile Val Arg His Ser Asp Pro 
110 115 120 

ATG ACC GAG GCT GCC GAA AGO CGGGCCAAGGCCACTGQCCTGATC AGO 2011 
Met Thr Glu Ala Ala Glu Ser Arg Ala Lys Ala Thr Gly Leu Ile Ser 
125 130 135 

GCA CTG AAA AAC CAG GOG CCC TOG CGC TTC GGC AAT CAC CIG CAA GTG 2059 
Ala Leu Lys Asn Gln Ala Pro Ser Arg Phe Gly Asn His Leu Gln Val 
140 145 150 

CGC GCC GCA TTG GCC AGC CGC AAT GCC TAC GTC TOG GAC TTC TGG CTG 2107 
Arg Ala Ala Leu Ala Ser Arg Asn Ala Tyr Val Ser Asp Phe Trp Leu 
155 160 165 170 

ATGGACAGCCAGCAGaSGGAGCAGATCCAGGCCGACTTCAGTGGGOGC 2155 
Met Asp Ser Gln Gln Arg Glu Gln Ile Gln Ala Asp Phe Ser Gly Arg 
175 180 185 

CAGGTGCTGATCGTCGACGCCGAAGACAOCTTC AOC TOG AIG ATC GCC 2203 
Gln Val Leu Ile Val Asp Ala Glu Asp Thr Phe Thr Ser Met Ile Ala 
190 195 200 

AAG CAA CTG OGG GCC CTG GGC CIG GTA GTG ACG GTG TGC AGC TTC AGC 2251 
Lys Gln Leu Arg Ala Leu Gly Leu Val Val Thr Val Cys Ser Phe Ser 
205 210 215 

GAC GAA TAC AGC TTT GAA GGC TAC GAC CTG GTC ATC ATG GGC CCC GGC 2299 
Asp Glu Tyr Ser Phe Glu Gly Tyr Asp Leu Val Ile Met Gly Pro Gly 
220 225 230 

CXrOGCi^CCGAGCGAAGTCC^AACaGOOG AAA ATC AAC CAC CTC CAC 2347 
Pro Gly Asn Pro Ser Glu Val Gln Gln Pro Lys Ile Asn His Leu His 
235 240 245 250 

GIGG0CA!rcCXK:TCrTTGC7rCAGCC3^C^ OCA TTC CIC GOG GIG 2395 
Val Ala Ile Arg Ser Leu Leu Ser Gln Gln Arg Pro Phe Leu Ala Val 
255 260 265 

TGC CTG AGO CAT CAG GIG OTG AGO CTG TGC CIG GGC CTG GMi CTG GAG 2443 
Cys Leu Ser His Gln Val Leu Ser Leu Cys Leu Gly Leu Glu Leu Glu 
270 275 280 

CGC AAA GCC ATT CCC AAC CAG GGC GTG CAA AAA CAG ATC GAC CTG TTT 2491 
Arg Lys Ala Ile Pro Asn Gln Gly Val Gln Lys Gln Ile Asp Leu Phe 
285 290 295 

GGCAATGrc<3^0GGGTGGGTTTCTACAACAOOTTCGCCGOCCAG 2539 
Gly Asn Val Glu Arg Val Gly Phe Tyr Asn Thr Phe Ala Ala Gln Ser 
300 305 310 

TOG AGT GAC CGC CTG GAC ATC GAC GGC ATC GGC ACC GTC GAA ATC AGO 2587 
Ser Ser Asp Arg Leu Asp lie Asp Gly lie Gly Thr Val Glu He Ser 
315 320 325 330 

CGC GAC AGO GftG ACC GGCCaGGTGCATGCCCTGOGTGQCOOCTCG TTC 2635 
Arg Asp Ser Glu Thr Gly Glu Val His Ala Leu Arg Gly Pro Ser Phe 
335 340 345 

GCC TCC ATG CAGTTTCATGCCGAGTCGCTGCTGACCCAGGAAGGTOOG 2683 
Ala Ser Met Gin Phe His Ala Glu Ser Leu Leu thr Gin Glu Gly Pro 
350 355 360 

OKrATCATCQCCGACCIGCTGOGGCACGCCCrGATCCAC ACA OCT GTC 2731 
Arg He He Ala Asp Leu Leu Arg His Ala Leu He His Thr Pro Val 
365 370 375 

GAGAACAACGCTTCGGCCGCCGGG AGA^TAA OC AIG CAC CAT TAC GTC 2778 
Glu Asn Asn Ala Ser Ala Ala Gly Arg * Met His His Tyr Val 
380 385 1 5 

ATCATCGACGCCTTTGCCAGCGTCOOGCTGGAAGGCAATCOGGTCGOG 2826 
He He Asp Ala Phe Ala Ser Val Pro Leu Glu Gly Asn Pro Val Ala 
10 15 20 

GTGTTCTTTGACGCCGATGACTTGTCGGCCGAG CAA ATG CAA CGC ATT 2874 
Val Phe Phe Asp Ala Asp Asp Leu Ser Ala Glu Gin Met Gin Arg He 
25 30 35 

GOCOGGC^ATGAACCTGTaSGAAAOCACTTTCGIGCTCAAGaSV^ 2922 
Ala Arg Glu Met Asn Leu Ser Glu Thr Thr Phe Val Leu Lys Pro Arg 
40 45 50 

AACTGCGGCQCTGCGCTGATCCGGATCTTC ACC COG GTC AAC GAA CTG 2970 
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Asn Cys Gly Asp Ala Leu He Arg He Phe Thr Pro Val Asn Glu Leu 
55 60 65 



CXXTTCGCCGGGCACCX^GTTGCTXSQQCACGGACiOT 3018
Pro Phe Ala Gly His Pro Leu Leu Gly Thr Asp Ile Ala Leu Gly Ala
70 75 80 85

CGC ACC GAC AAT CAC CGG CTG TTC CTG GAA ACC CAG ATG GGC ACC ATC 3066
Arg Thr Asp Asn His Arg Leu Phe Leu Glu Thr Gln Met Gly Thr Ile
90 95 100

GCC TTT GAG CTG GAG CGC CAG AAC GGC AGC GTC ATC GCC GCC AGC ATG 3114
Ala Phe Glu Leu Glu Arg Gln Asn Gly Ser Val Ile Ala Ala Ser Met
105 110 115

GAC CAG OCX? ATA CCG ACC TGG ACG GCC CTG GGG CGC GAC GCC GAG TTG 3162
Asp Gln Pro Ile Pro Thr Trp Thr Ala Leu Gly Arg Asp Ala Glu Leu
120 125 130

CTC AAG GCC CTG GGC ATC AGC GAC TCG ACC TTT CCC ATC GAG ATC TAT 3210
Leu Lys Ala Leu Gly Ile Ser Asp Ser Thr Phe Pro Ile Glu Ile Tyr
135 140 145

CAC AAC GGC CCG CGT CAT GTG TTT GTC GGC CTG CCA AGC ATC GCC GCG 3258
His Asn Gly Pro Arg His Val Phe Val Gly Leu Pro Ser Ile Ala Ala
150 155 160 165

CTG TCG GCC CTG CAC 000 GAC CAC CGT GCC CTG TAC AGC TTC CAC GAC 3306
Leu Ser Ala Leu His Pro Asp His Arg Ala Leu Tyr Ser Phe His Asp
170 175 180

ATG GCC ATC AAC TGT TTT GCC GGT GCG GGA CGG CGC TGG CGC AGC CGG 3354
Met Ala Ile Asn Cys Phe Ala Gly Ala Gly Arg Arg Trp Arg Ser Arg
185 190 195

ATGTTCTCGCCGGCCTATGGGGTGGTCGAGGATGOGNOCAOSGGCT^ 3402
Met Phe Ser Pro Ala Tyr Gly Val Val Glu Asp Ala Xaa Thr Gly Ser
200 205 210

GCT GCC GGG CCC TTG GCG ATC CAT CTG GCG CGG CAT GGC CAG ATC GAG 3450
Ala Ala Gly Pro Leu Ala Ile His Leu Ala Arg His Gly Gln Ile Glu
215 220 225

TTC GGC CAG CAG ATC GAA ATT CTT CAG GGC GTG GAA ATC GGC CGC CCC 3498
Phe Gly Gln Gln Ile Glu Ile Leu Gln Gly Val Glu Ile Gly Arg Pro
230 235 240 245



TCA CTC ATG TTC GCC CGG GCO GAG GGC CGC GCC GAT CAA CTG ACG CGG 3546
Ser Leu Met Phe Ala Arg Ala Glu Gly Arg Ala Asp Gln Leu Thr Arg
250 255 260

GTC GAA GTA TCA GGC AAT GGC ATC ACC TTC GGA CGG GGG ACC ATC GTT 3594
Val Glu Val Ser Gly Asn Gly Ile Thr Phe Gly Arg Gly Thr Ile Val
265 270 275

CTA TGA ACAGTTCAGT ACTAGGCAAG CCGCTGTTGG GTAAAGGCAT GTGGGAATOG 3650
Leu *





CACIGfflTGC 


GOCGTTCCCC 
OGAAOGOGCA 


GAGIAOCAGA AOOOGOCTGC CGKTCCCKrG 


3710 


AGOQIGCIGC 


ACAACIGGCT 


CGCCGCGTQQ GCA!ICOGOSA AOOOOGIGOG 


3770 


CTGGOGCTGG 


OCAOGGCTGA 


CAGCXIAOGGC 


OGGCXTTTOGA CAa3CA!rCGT GGTGKrCRSI 


3830 


G?^GATCAGTG 


ACACCGGGGT 


GCTGTTCAGC 


AiXCATOCCG GAAGOCAGAA AGGOOGOGAA 


3890 


CTGACAGAG^ 




CI0GGGGA06 


CIGTATTGGC GOGAAAOCAG 0CAGCA6ATC 


3950 


AICCTCAAirG 


GOCAGGOOST 


GCX;CAXGCX^ 


GA3IGCCAAGG CIGAOGAGGC CIGGTTGAAG 


4010 


CGcacmoG 


CXAOGCATCC 


GAIGICAl^ 


GIGICIOGOC AGAGIGAAGA ACICAAGGAT 


4070 


GTTCAAG0C2^ 


TGOGCAAOGC 


OGOCAGGGAA 


CTGGCX3GAG6 TTCAAGGICC GCTGCXX3CGT 


4130 


CXXS^GGGTT 


ATTGOGTGTT 


TGAGTTACGG 


CTTGAA!TOQC TGGAGTTCTG GGGTAACGGC 


4190 


GAGGAGOGCX: 


OIGCATGftAOG 


CTTGCGCTAT. 


GAC0GCAG06 CIGAAGGCIG. GM>iRCK£CGC 


4250 


OGGXTACAGC 


CATAGGGTOC 


0GCX3AXAAAC 


KIGClTrGMi GIGXIGGCT GCICCAGCTT 


4310 


OGAACTCATT 


GOGCAAACin! 


CAACACTTAT 


GACAOOOGGT CAACATGAS^ AAAGTOCAGA 


4370 


TGOGAAAGAA 


OGOGTATICG 


AAA!CAOCAAA 


CAGAGAGICC GCSm^OCAA AGIGIGIAAC 


4430 


GACATTAACX 


CCTA3CTGAA 


TTTTATACTT 


GCTCTAGAAC GTTGTCCTTG ACCCAGCGAT 


4490 


AGACATCGGG 


CCAGAACCIA 


CAXAAACAAA 


GTCAGACATT ACIGAGGCIG CIACCAaX3CT 


4550 


AGATTTTCAA 


AACAAOOGTA 


AAXAICIGAA 


AASIGCAGAAi TCCITCAAAG CTT 


4603 



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 456 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Met Thr Gly Ile Pro Ser Ile Val Pro Tyr TOa Leu Pro Thr Asn Arg
1 5 10 15

Asp Leu Pro Val Asn Leu Ala Gln Trp Ser Ile A^ Pro Glu Arg Ala
20 25 30

Val Leu Leu Val His Asp Met Gln Arg Tyr Phe Leu Arg Pro Leu Pro
35 40 45

Asp Ala Leu Arg Asp Glu Val Val Ser Asn Ala Ala Arg Ile Arg Gln
50 55 60

Trp Ala Ala Asp Asn Gly Val Pro Val Ala Tyr Thr Ala Gln Pro Gly
65 70 75 80

Ser Met Ser Glu Glu Gln Arg Gly Leu Leu Lys Asp Phe Trp Gly Pro
85 90 95

Gly Met Lys Ala Ser Pro Ala Asp Arg Glu Val Val Gly Ala Leu Thr
100 105 110

Pro Lys Pro Gly Asp 1^ Lep Leu Thr Lys Trp Arg Tyr Ser Ala Phe
115 120 125

Phe Asn Ser Asp Leu Leu Glu Arg Met Arg Ala Asn Gly Arg Asp Gln
130 135 140

Leu Ile Leu Cys Gly Val Tyr Ala His Val Gly Val Leu Ile Ser Thr
145 150 155 160

Val Asp Ala Tyr Ser Asn Asp Ile Gln Pro Phe Leu Val Ala Asp Ala
165 170 175

Ile Ala Asp Phe Ser Lys Glu His His Trp Met Pro Ser Asn Thr Pro
180 185 190

Pro Ala Val Ala Pro Cys His His His Arg Arg Gly Gly Ala Met Ser
195 200 205

Gln Thr Ala Ala His Leu Met Glu Arg Ile Leu Gln Pro Ala Pro Glu
210 215 220

Pro Phe Ala Leu Leu Tyr Arg Pro Glu Ser Ser Gly Pro Gly Leu Leu
225 230 235 240

Asp Val Leu Ile Gly Glu Met Ser Glu Pro Gln Val Leu Ala Asp Ile
245 250 255

Asp Leu Pro Ala Thr Ser Ile Gly Ala Pro Arg Leu Asp Val Leu Ala
260 265 270

Leu Ile Pro Tyr Arg Gln Ile Ala Glu Arg Gly Phe Glu Ala Val Asp
275 280 285

A^ Glu Ser Pro Leu Leu Ala Met Asn Ile Thr Glu Gln Gln Ser Ile
290 295 300

Ser Ile Glu Arg Leu Leu Gly Met Leu Pro Asn Val Pro Ile Gln Leu
305 310 315 320

Asn Ser Glu Arg Phe A^ Leu Ser Asp Ala Ser Tyr Ala Glu Ile Val
325 330 335



Ser Gln Val Ile Ala Asn Glu Ile Gly Ser Gly Glu Gly Ala Asn Phe
340 345 350

Val Ile Lys Arg Thr Phe Leu Ala Glu Ile Ser Glu Tyr Gly Pro Ala
355 360 365

Ser Ala Leu Ser Phe Phe Arg His Leu Leu Glu Arg Glu Lys Gly Ala
370 375 380

Tyr Trp Thr Phe Ile Ile His Thr Gly Ser Arg Thr Phe Val Gly Ala
385 390 395 400

Ser Pro Glu Arg His Ile Ser Ile Lys Asp Gly Leu Ser Val Met Asn
405 410 415

Pro Ile Ser Gly Thr T^r Arg Tyr Pro Pro Ala Gly Pro Asn Leu Ser
420 425 430

Glu Val Met Asp Phe Leu Ala Asp Arg Lys Glu Ala 7^ Glu Leu Tyr
435 440 445

Met Val Val Asp Glu Glu Leu *
450 455



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 388 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Met Met Ala Arg Ile Cys Glu Asp Gly Gly His Val Leu Gly Pro Tyr
1 5 10 15

Leu Lys Glu Met Ala His Leu Ala His Thr Glu Tyr Phe Ile Glu Gly
20 25 30

Lys Thr His Arg Asp Val Arg Glu Ile Leu Arg Glu Thr Leu Phe Ala
35 40 45

Pro Thr Val Thr Gly Ser Pro Leu Glu Ser Ala Cys Arg Val Ile Gln
50 55 60

Arg Tyr Xaa Pro Gln Gly Arg Ala Tyr Tyr Ser Gly Met Ala Ala Leu
65 70 75 80

Ile Gly Ser Asp Gly Lys Gly Gly Arg Ser Leu Asp Ser Ala Ile Leu
85 90 95

Ile Arg Thr Ala Asp Ile Asp Asn Ser Gly Glu Val Arg Ile Ser Val
100 105 110

Gly Ser Thr Ile Val Arg His Ser Asp Pro Met Thr Glu Ala Ala Glu
115 120 125

Ser Arg Ala Lys Ala Thr Gly Leu Ile Ser Ala Leu Lys Asn Gln Ala
130 135 140

Pro Ser Arg Phe Gly Asn His Leu Gln Val Arg Ala Ala Leu Ala Ser
145 150 155 160

Arg Asn Ala Tyr Val Ser Asp Phe Trp Leu Met Asp Ser Gln Gln Arg
165 170 175

Glu Gln Ile Gln Ala Asp Phe Ser Gly Arg Gln Val Leu Ile Val Asp
180 185 190

Ala Glu Asp Thr Phe Thr Ser Met Ile Ala Lys Gln Leu Arg Ala Leu
195 200 205

Gly Leu Val Val Thr Val Cys Ser Phe Ser Asp Glu Tyr Ser Phe Glu
210 215 220

Gly Tyr Asp Leu Val Ile Met Gly Pro Gly Pro Gly Asn Pro Ser Glu
225 230 235 240

Val Gln Gln Pro Lys Ile Asn His Leu His Val Ala Ile Arg Ser Leu
245 250 255

Leu Ser Gln Gln Arg Pro Phe Leu Ala Val Cys Leu Ser His Gln Val
260 265 270

Leu Ser Leu Cys Leu Gly Leu Glu Leu Gln Arg Lys Ala Ile Pro Asn
275 280 285

Gln Gly Val Gln Lys Gln Ile Asp Leu Phe Gly Asn Val Glu Arg Val
290 295 300

Gly Phe Tyr Asn Thr Phe Ala Ala Gln Ser Ser Ser Asp Arg Leu Asp
305 310 315 320

Ile Asp Gly Ile Gly Thr Val Glu Ile Ser Arg Asp Ser Glu Thr Gly
325 330 335

Glu Val His Ala Leu Arg Gly Pro Ser Phe Ala Ser Met Gln Phe His
340 345 350

Ala Glu Ser Leu Leu Thr Gln Glu Gly Pro Arg Ile Ile Ala Asp Leu
355 360 365

Leu Arg His Ala Leu Ile His Thr Pro Val Glu Asn Asn Ala Ser Ala
370 375 380

Ala Gly Arg *
385



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 279 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Met His His Tyr Val Ile Ile Asp Ala Phe Ala Ser Val Pro Leu Glu
1 5 10 15

Gly Asn Pro Val Ala Val Phe Phe Asp Ala Asp Asp Leu Ser Ala Glu
20 25 30

Gln Met Gln Arg Ile Ala Arg Glu Met Asn Leu Ser Glu Thr Thr Phe
35 40 45

Val Leu Lys Pro Arg Asn Cys Gly Asp Ala Leu Ile Arg Ile Phe Thr
50 55 60

Pro Val Asn Glu Leu Pro Phe Ala Gly His Pro Leu Leu Gly Thr Asp
65 70 75 80

Ile Ala Leu Gly Ala Arg Thr Asp Asn His Arg Leu Phe Leu Glu Thr
85 90 95

Gln Met Gly Thr Ile Ala Phe Glu Leu Glu Arg Gln Asn Gly Ser Val
100 105 110

Ile Ala Ala Ser Met Asp Gln Pro Ile Pro Thr Trp Thr Ala Leu Gly
115 120 125

Arg Asp Ala Glu Leu Leu Lys Ala Leu Gly Ile Ser Asp Ser Thr Phe
130 135 140

Pro Ile Glu Ile Tyr His Asn Gly Pro Arg His Val Phe Val Gly Leu
145 150 155 160

Pro Ser Ile Ala Ala Leu Ser Ala Leu His Pro Asp His Arg Ala Leu
165 170 175

Tyr Ser Phe His Asp Met Ala Ile Asn Cys Phe Ala Gly Ala Gly Arg
180 185 190

Arg Trp Arg Ser Arg Met Phe Ser Pro Ala Tyr Gly Val Val Glu Asp
195 200 205

Ala Xaa Thr Gly Ser Ala Ala Gly Pro Leu Ala Ile His Leu Ala Arg
210 215 220

His Gly Gln Ile Glu Phe Gly Gln Gln Ile Glu Ile Leu Gln Gly Val
225 230 235 240

Glu Ile Gly Arg Pro Ser Leu Met Phe Ala Arg Ala Glu Gly Arg Ala
245 250 255

Asp Gln Leu Thr Arg Val Glu Val Ser Gly Asn Gly Ile Thr Phe Gly
260 265 270

Arg Gly Thr Ile Val Leu *
275



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1007 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..669

(D) OTHER INFORMATION: /gene= "phz4"
/label= ORF4
/note= "This DNA sequence is repeated from SEQ ID NO: 17 so that the overlapping ORF4 may be separately translated"
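
The relationship between the CDS interval given in the feature table (1..669) and the 222-residue length reported for the ORF4 translation can be checked with a short script. The following is a minimal sketch only: the function names `translate_cds` and `residues_in_cds` and the toy sequence are hypothetical illustrations, not part of the listing, and the sketch assumes the standard genetic code and a CDS whose final codon is a stop codon.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(dna: str, start: int, end: int) -> str:
    """Translate the 1-based, inclusive interval start..end of dna.

    Stop codons are rendered as '*', as in the sequence listing.
    """
    cds = dna[start - 1:end].upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def residues_in_cds(start: int, end: int) -> int:
    """Number of amino acids encoded, assuming the last codon is a stop."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    return length // 3 - 1


# Hypothetical toy sequence (not taken from the listing): three codons plus a stop.
toy = "ATGAACAGTTAGGGT"
print(translate_cds(toy, 1, 12))   # MNS*
print(residues_in_cds(1, 669))     # 222
```

Under these assumptions, an interval of 669 nucleotides holds 223 codons, that is, 222 residues followed by one stop codon.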



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

ATG AAC AGT TCA GTA CTA GGC AAG CCG CTG TTG GGT AAA GGC ATG TCG 48
Met Asn Ser Ser Val Leu Gly Lys Pro Leu Leu Gly Lys Gly Met Ser
1 5 10 15

GAA TCG CTG ACC GGC ACA CTG GAT GCG CCG TTC CCC GAG TAC CAG AAG 96
Glu Ser Leu Thr Gly Thr Leu Asp Ala Pro Phe Pro Glu Tyr Gln Lys
20 25 30

CCG CCT GCC GAT CCC ATG AGC GTG CTG CAC AAC TGG CTC GAA CGC GCA 144
Pro Pro Ala Asp Pro Met Ser Val Leu His Asn Trp Leu Glu Arg Ala
35 40 45

CGC CGC GTG GGC ATC CGC GAA CCC CGT GCG CTG GCG CTG GCC ACG GCT 192
Arg Arg Val Gly Ile Arg Glu Pro Arg Ala Leu Ala Leu Ala Thr Ala
50 55 60



GAC AGC CAG GGC CGG CCT TCG ACA CGC ATC GTG GTG ATC AGT GAG ATC 240
Asp Ser Gln Gly Arg Pro Ser Thr Arg Ile Val Val Ile Ser Glu Ile
65 70 75 80

AGTGACAOCGGGGIGCZGTTCAGCi^CATGCCGGAAG^ AAA GGC 288
Ser Asp Thr Gly Val Leu Phe Ser Thr His Ala Gly Ser Gln Lys Gly
85 90 95

CGC GAA CTG ACA GAG AAC CCC TGG GCT TCG GGG ACG CTG TAT TGG CGC 336
Arg Glu Leu Thr Glu Asn Pro Trp Ala Ser Gly Thr Leu Tyr Trp Arg
100 105 110



GAA ACC AGC CAG CAG ATC ATC CTC AAT GGC CAG GCC GTG CGC ATG CCG 384
Glu Thr Ser Gln Gln Ile Ile Leu Asn Gly Gln Ala Val Arg Met Pro
115 120 125

GATGCCAAGOTGACGAGGCCTCGTrGAAGCGCOCTTATGCC ACG CAT 432
Asp Ala Lys Ala Asp Glu Ala Trp Leu Lys Arg Pro Tyr Ala Thr His
130 135 140

CCG ATG TCA TCG GTG TCT CGC CAG AGT GAA GAA CTC AAG GAT GTT CAA 480
Pro Met Ser Ser Val Ser Arg Gln Ser Glu Glu Leu Lys Asp Val Gln
145 150 155 160

GCC ATG CGC AAC GCC GCC AGG GAA CTG GCC GAG GTT CAA GGT CCG CTG 528
Ala Met Arg Asn Ala Ala Arg Glu Leu Ala Glu Val Gln Gly Pro Leu
165 170 175

CCG CGT CCC GAG GGT TAT TGC GTG TTT GAG TTA CGG CTT GAA TCG CTG 576
Pro Arg Pro Glu Gly Tyr Cys Val Phe Glu Leu Arg Leu Glu Ser Leu
180 185 190

GAG TTC TGG GGT AAC GGC GAG GAG CGC CTG CAT GAA CGC TTG CGC TAT 624
Glu Phe Trp Gly Asn Gly Glu Glu Arg Leu His Glu Arg Leu Arg Tyr
195 200 205

GAC CGC AGC GCT GAA GGC TGG AAA CAT CGC CGG TTA CAG CCA TAGGGTCCCG 676
Asp Arg Ser Ala Glu Gly Trp Lys His Arg Arg Leu Gln Pro
210 215 220



CGATAAACAT GCTTTGAACT GOCTGGCTGC TCCAGCTTCG AACTCATTGO GCAAACTTCA 736

ACACTTATQV CAOOOGGICA ACATGAGAAA AGICCAGATG OGAAAGAAOG OGIATTOGAA 796

AXACCAAACA GAGAGTCOGG ATCAOCAAAG TGIGXAAOGA CATTAACTCC TATCTGAATT 856

nATAGTFGC TCTAGAAOCT TGTOCTTQyC OCAGCGATAG ACATOGGGCC AGAACCIACA 916

TAAACAAACT CAGACATTAC TGAGGCTGCT ACCATGOTAG ATTTICAAAA CAAGOGXAAA 976


TAICTGAAAA GIGCAGAAIC CTTCAAAGCT T 1007

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 222 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Met Asn Ser Ser Val Leu Gly Lys Pro Leu Leu Gly Lys Gly Met Ser
1 5 10 15

Glu Ser Leu Thr Gly Thr Leu Asp Ala Pro Phe Pro Glu Tyr Gln Lys
20 25 30

Pro Pro Ala Asp Pro Met Ser Val Leu His Asn Trp Leu Glu Arg Ala
35 40 45

Arg Arg Val Gly Ile Arg Glu Pro Arg Ala Leu Ala Leu Ala Thr Ala
50 55 60

Asp Ser Gln Gly Arg Pro Ser Thr Arg Ile Val Val Ile Ser Glu Ile
65 70 75 80

Ser Asp Thr Gly Val Leu Phe Ser Thr His Ala Gly Ser Gln Lys Gly
85 90 95

Arg Glu Leu Thr Glu Asn Pro Trp Ala Ser Gly Thr Leu Tyr Trp Arg
100 105 110

Glu Thr Ser Gln Gln Ile Ile Leu Asn Gly Gln Ala Val Arg Met Pro
115 120 125

Asp Ala Lys Ala Asp Glu Ala Trp Leu Lys Arg Pro Tyr Ala Thr His
130 135 140

Pro Met Ser Ser Val Ser Arg Gln Ser Glu Glu Leu Lys Asp Val Gln
145 150 155 160

Ala Met Arg Asn Ala Ala Arg Glu Leu Ala Glu Val Gln Gly Pro Leu
165 170 175

Pro Arg Pro Glu Gly Tyr Cys Val Phe Glu Leu Arg Leu Glu Ser Leu
180 185 190

Glu Phe Trp Gly Asn Gly Glu Glu Arg Leu His Glu Arg Leu Arg Tyr
195 200 205

Asp Arg Ser Ala Glu Gly Trp Lys His Arg Arg Leu Gln Pro
210 215 220

What is claimed is:

1. An isolated DNA molecule encoding one or more polypeptides required for the
biosynthesis of an antipathogenic substance (APS) in a heterologous host, wherein said
APS is selected from the group consisting of pyrrolnitrin and soraphen.

2. The isolated DNA molecule of claim 1, wherein said APS is pyrrolnitrin and said
polypeptide is selected from the group consisting of SEQ ID Nos. 2-5.

3. The isolated DNA molecule of claim 1, wherein said APS is pyrrolnitrin and said DNA
molecule has the sequence set forth in SEQ ID No. 1.

4. The isolated DNA molecule of claim 1, wherein said APS is soraphen and said DNA
molecule has the sequence set forth in SEQ ID No. 6.

5. The DNA molecule according to any one of claims 1 to 4 engineered to form part of a
plant genome.

6. An expression vector comprising the isolated DNA molecule of claim 1 wherein said
vector is capable of expressing one or more polypeptides encoded by said DNA molecule in
a host cell.

7. A heterologous host transformed with an expression vector comprising the isolated DNA
molecule of claim 1, wherein said host is selected from the group consisting of a bacterium,
a fungus, a yeast and a plant.

8. The heterologous host of claim 7, wherein said host is a plant.

9. A host capable of synthesizing an antipathogenic substance not naturally occurring in
said host.

10. The host of claim 9, wherein said antipathogenic substance is selected from the group
consisting of a carbohydrate containing antibiotic, a peptide antibiotic, a heterocyclic
antibiotic containing nitrogen, a heterocyclic antibiotic containing oxygen, a heterocyclic
antibiotic containing nitrogen and oxygen, a polyketide, a macrocyclic lactone, and a
quinone.

11. The host of claim 10, wherein said peptide antibiotic is rhizocticin. 

12. The host of claim 10, wherein said carbohydrate containing antibiotic is an 
aminoglycoside. 

13. The host of claim 10, wherein said antipathogenic substance is a heterocyclic antibiotic 
containing nitrogen. 

14. The host of claim 13, wherein said heterocyclic antibiotic containing nitrogen is selected
from the group consisting of phenazine and pyrrolnitrin.

15. The host of claim 10, wherein said antipathogenic substance is a polyketide. 

16. The host of claim 15, wherein said polyketide is soraphen. 

17. The host of claim 9, wherein said antipathogenic substance is resorcinol.

18. The host of claim 9, wherein said antipathogenic substance is a methoxyacrylate.

19. The host of claim 18, wherein said methoxyacrylate is strobilurin B.

20. The host of claim 9, wherein said host is selected from the group consisting of a plant, 
a bacterium, a yeast and a fungus. 

21. The host of claim 20, wherein said host is a plant.

22. The host of claim 21, wherein said host is a hybrid plant.

23. Propagating material of a host according to claim 21 or 22 treated with a protectant
coating.

24. Propagating material according to claim 23, comprising a preparation selected from the 
group consisting of herbicides, insecticides, fungicides, bactericides, nematicides, 
molluscicides or mixtures thereof. 

25. Propagating material according to claim 23 or 24 characterized in that it consists of 
seed. 

26. The host of claim 20, wherein said host is a biocontrol agent.

27. The host of claim 20, wherein said host is a plant colonizing organism.

28. The host of claim 20, wherein said host is suitable for producing large quantities of
said APS.

29. A host capable of synthesizing enhanced amounts of an antipathogenic substance
naturally occurring in said host, wherein said host is transformed with one or more DNA
molecules collectively encoding the complete set of polypeptides required to synthesize
said antipathogenic substance.

30. A method for protecting a plant against a phytopathogen comprising transforming said
plant with one or more vectors collectively capable of expressing all of the polypeptides
necessary to produce an anti-phytopathogenic substance in said plant in amounts which
inhibit said phytopathogen.

31. A method for protecting a plant against a phytopathogen comprising treating said plant
with a biocontrol agent transformed with one or more vectors collectively capable of
expressing all of the polypeptides necessary to produce an anti-phytopathogenic substance
in amounts which inhibit said phytopathogen.

32. A method for protecting a plant against a phytopathogen comprising applying to said
plant a composition comprising an anti-phytopathogenic substance in amounts which inhibit
said phytopathogen, wherein said anti-phytopathogenic substance is obtained from the host
of claim 28.

33. A method for producing large quantities of an antipathogenic substance (APS) of
uniform chirality comprising

(a) transforming a host with one or more vectors collectively capable of
expressing all of the polypeptides necessary to produce said APS in said host;

(b) growing said host under conditions which allow production of said APS; and

(c) collecting said APS from said host.

34. A composition comprising an antipathogenic substance (APS) of uniform chirality
produced by the method of claim 33.

35. A method for identifying and isolating a gene from a microorganism required for the
biosynthesis of an antipathogenic substance (APS), wherein the expression of said gene is
under the control of a regulator of the biosynthesis of said APS, said method comprising

(a) cloning a library of genetic fragments from said microorganism into a vector
adjacent to a promoterless reporter gene in a vector such that expression of said reporter
gene can occur only if promoter function is provided by the cloned fragment;

(b) transforming the vectors generated from step (a) into a suitable host;

(c) identifying those transformants from step (b) which express said reporter gene
only in the presence of said regulator; and

(d) identifying and isolating the DNA fragment operably linked to the genetic fragment
from said microorganism present in the transformants identified in step (c);

wherein said DNA fragment isolated and identified in step (d) encodes one or more
polypeptides required for the biosynthesis of said APS.

36. An isolated polypeptide required for the biosynthesis of an antipathogenic substance
(APS) in a heterologous host, wherein said APS is selected from the group consisting of
pyrrolnitrin and soraphen.

37. The isolated polypeptide of claim 36, wherein said APS is pyrrolnitrin and said
polypeptide is selected from the group consisting of SEQ ID Nos. 2-5.

38. The isolated polypeptide of claim 36, wherein said APS is pyrrolnitrin and said polypeptide
is encoded by the nucleotide sequence set forth in SEQ ID No. 1.

39. The isolated polypeptide of claim 36, wherein said APS is soraphen and said
polypeptide is encoded by the nucleotide sequence set forth in SEQ ID No. 6.

40. Use of a DNA molecule according to claim 1 for genetically engineering a host 
organism to express said antipathogenic substance. 

41. Use according to claim 40, wherein said host is selected from the group consisting of a
plant, a bacterium, a yeast and a fungus.

42. Use according to claim 40, wherein the antipathogenic substance expressed does not
naturally occur in said host.

43. Use according to claim 40, wherein increased amounts of the antipathogenic substance
naturally occurring in said host are produced.

44. Use of the host according to claim 7 for protecting a plant against a phytopathogen.

45. Use of the composition according to claim 34 for protecting a plant against a
phytopathogen.

46. Use of the DNA molecule according to claim 5 to transfer the ability to express an
antipathogenic molecule from a parent plant to its progeny.